IT Blog

  • Blog
  • Technology
    • Technology
    • Architecture
    • CMS
    • CRM
    • Web
    • DotNET
    • Python
    • Database
    • BI
    • Program Language
  • Users
    • Login
    • Register
    • Forgot Password?
  • ENEN
    • 中文中文
    • ENEN
Experience IT
In a World of Technology, People Make the Difference.
Technology

Crawling 51job in Python

Summary The data is stored in javascript variable in 51job website. The crawling could not get right data directly from html, but javascript. Using json plugin to achive the goal. Website: https://search.51job.com/ Search result Html source The data is not in the page html, but javascript code in json format. Xpath Using xpath to look for data on the page is fine, but in the code it returns nothing, because the data is in javascript variable. contentx = le.HTML(content) rets = contentx.xpath('//div[@class="e"]//p[@class="t"]') # returns nothing Crawling with python Using regular expression to parse out the jason data in javascript. ret = re.findall('window.__SEARCH_RESULT__ = (.*?)</script>', content_str)[0] Use json plugin to load ret into json object. d = json.loads(ret) Finally, get the data from json array. Complete code import requests import lxml.etree as le import re import json content = requests.get( url='https://search.51job.com/list/010000,000000,0000,00,9,99,python,2,1.html?lang=c&postchannel=0000&workyear=99&cotype=99&degreefrom=99&jobterm=99&companysize=99&ord_field=0&dibiaoid=0&line=&welfare=', ).content content_str = content.decode('gbk', 'ignore') # <meta http-equiv="Content-Type" content="text/html; charset=gbk"> ret = re.findall('window.__SEARCH_RESULT__ = (.*?)</script>', content_str)[0] d = json.loads(ret) # <class 'dict'> {'top_ads': [], 'auction_ads': [], 'market_ads': [], 'engine_search_result': [{'type': 'engine_search_result', results = d['engine_search_result'] for result in results: print(result)    2,017 total views

2020-08-28 0 Comments 641 Views 0 Like IT Team Read more
Plugin

WordPress EventCalendar visibility

Based on the official knowledge base, currently there is no “out of the box” way to control events visibility — no plugin setting,  plugin filter,  etc.  But with a bit of custom coding, you can get pretty close. 1. Hide event from month view if the use is not logged in. Ref: https://theeventscalendar.com/knowledgebase/k/making-a-members-only-calendar/ 1. create a event category: Agent Assign the event to the category. 2. add stylesheet, e.g. Custom CSS & JS plugin. .type-tribe_events.tribe-events-category-agent { display: none; } body.logged-in .type-tribe_events.tribe-events-category-agent { display: inline; } 2. disable popup tip window. ref: https://theeventscalendar.com/support/forums/topic/disabling-the-calendar-pop-up-views/ body .tribe-events-calendar .tribe-events-tooltip { visibility:hidden; } @media (max-width: 600px) { body .tribe-events-calendar .tribe-events-tooltip { visibility:hidden; } }    971 total views

2020-08-20 0 Comments 252 Views 0 Like IT Team Read more
BI

Python Data Visualization Practise

Python Code import json import requests # 3rd party import jsonpath from pyecharts.charts import Map from pyecharts import options as opts from src.covid19_data import nameMap url = 'https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist' resp = requests.post(url).text data = json.loads(resp) # get contry names and confirmed cases $:outer {}, ..: fuzzy match + key name = jsonpath.jsonpath(data, "$..name") # ['美国', '巴西', '印度', '俄罗斯', ... ] confirm = jsonpath.jsonpath(data, "$..confirm") pairs = list(zip(name, confirm)) # [('美国', 5321520), ('巴西', 3109630), ('印度', 2329638), ...] # 2. data visualization map plotting - size,title,color,data map_ = Map(opts.InitOpts(width='1200px', height='600px')) .add(series_name="Global Covid-19 Confirmed Cases", data_pair=pairs, # data entry - country : confirmed maptype="world", # world map name_map=nameMap, # country name mapping is_map_symbol_show=False) # 不显示标记点 # set series options map_.set_series_opts(label_opts=opts.LabelOpts(is_show=False)) map_.set_global_opts(title_opts=opts.TitleOpts(title="Global Covid-19 Pandemic (Chinese)"), visualmap_opts=opts.VisualMapOpts(max_=6000000, is_piecewise=True)) map_.render('../result/covid19.html') covid19_data nameMap = { 'Singapore Rep.': '新加坡', 'Dominican Rep.': '多米尼加', 'Palestine': '巴勒斯坦', 'Bahamas': '巴哈马', 'Timor-Leste': '东帝汶', 'Afghanistan': '阿富汗', 'Guinea-Bissau': '几内亚比绍', "Côte d'Ivoire": '科特迪瓦', 'Siachen Glacier': '锡亚琴冰川', "Br. Indian Ocean Ter.": '英属印度洋领土', 'Angola': '安哥拉', 'Albania': '阿尔巴尼亚', 'United Arab Emirates': '阿联酋', 'Argentina': '阿根廷', 'Armenia': '亚美尼亚', 'French Southern and Antarctic Lands': '法属南半球和南极领地', 'Australia': '澳大利亚', 'Austria': '奥地利', 'Azerbaijan': '阿塞拜疆', 'Burundi': '布隆迪', 'Belgium': '比利时', 'Benin': '贝宁', 'Burkina Faso': '布基纳法索', 'Bangladesh': '孟加拉国', 'Bulgaria': '保加利亚', 'The Bahamas': '巴哈马', 'Bosnia and Herz.': '波斯尼亚和黑塞哥维那', 'Belarus': '白俄罗斯', 'Belize': '伯利兹', 'Bermuda': '百慕大', 'Bolivia': '玻利维亚', 'Brazil': '巴西', 'Brunei': '文莱', 'Bhutan': '不丹', 'Botswana': '博茨瓦纳', 'Central African Rep.': '中非共和国', 'Canada': '加拿大', 'Switzerland': '瑞士', 'Chile': '智利', 'China': '中国', 'Ivory Coast': '象牙海岸', 'Cameroon': '喀麦隆', 'Dem. Rep. Congo': '刚果(金)', 'Congo': '刚果(布)', 'Colombia': '哥伦比亚', 'Costa Rica': '哥斯达黎加', 'Cuba': '古巴', 'N. Cyprus': '北塞浦路斯', 'Cyprus': '塞浦路斯', 'Czech Rep.': '捷克', 'Germany': '德国', 'Djibouti': '吉布提', 'Denmark': '丹麦', 'Algeria': '阿尔及利亚',…

2020-08-12 0 Comments 834 Views 0 Like IT Team Read more
SQL Server

concat all columns in SQL Server

Joining all column values into one string with concat function is a pain, especially for a table with large numbers of columns. This script concats all columns quickly. declare @sql varchar(max) SELECT @sql = ISNULL( @sql+ '' + '+'',''+', '') + 'convert(varchar(50), [' + c.name + '])' FROM sys.all_columns c join sys.tables t ON c.object_id = t.object_id WHERE t.name = 'herb' and c.name!='id' EXEC( 'SELECT ' + @sql + ' FROM herb') Result The script generates long string value of @sql as following: convert(varchar(50), [alias])+','+convert(varchar(50), [application])+','+convert(varchar(50), [attachment])+','+convert(varchar(50), [category])+','+convert(varchar(50), [caution])+','+convert(varchar(50), [chemical])+','+convert(varchar(50), [desc])+','+convert(varchar(50), [english])+','+convert(varchar(50), [enterdate])+','+convert(varchar(50), [extract])+','+convert(varchar(50), [form])+','+convert(varchar(50), [form1])+','+convert(varchar(50), [grow])+','+convert(varchar(50), [guijing])+','+convert(varchar(50), [image_all])+','+convert(varchar(50), [image_leaf])+','+convert(varchar(50), [image_root])+','+convert(varchar(50), [indication])+','+convert(varchar(50), [indication1])+','+convert(varchar(50), [locations])+','+convert(varchar(50), [medicalprop])+','+convert(varchar(50), [name])+','+convert(varchar(50), [notes])+','+convert(varchar(50), [origin])+','+convert(varchar(50), [pharmacology])+','+convert(varchar(50), [pinyin])+','+convert(varchar(50), [process])+','+convert(varchar(50), [source])+','+convert(varchar(50), [store])+','+convert(varchar(50), [url])+','+convert(varchar(50), [usage])  1,863 total views

2020-05-31 0 Comments 963 Views 1 Like IT Team Read more
DotNET

Crawler - Split column

Problem The second line contains pingying and alias. html code: Zhōngfǔ&nbsp;&nbsp;&nbsp;肺之募穴 First try get data roughly: alias = p.Select(Selectors.XPath("//span[@style]"))?.Value.Replace("&nbsp;", " "); Result: Zhōngfǔ   肺之募穴 Split the alias column in MySQL: UPDATE `xuewei` set pingying=substring_index(alias,' ',1); UPDATE `xuewei` set bieming=substring_index(alias,' ',-1); Split the column in process: alias = p.Select(Selectors.XPath("//span[@style]"))?.Value.Replace("&nbsp;", " "); var arr = alias.Split(' '); pingying = arr[0]; alias = arr[arr.Length-1]; //get last item in the array for the space in between is uncertain. Save to database: await conn.ExecuteAsync( $"INSERT IGNORE INTO xuewei (name,pingying,alias,position,anatomy,indication,operation,imagegeneral,imagepoint) VALUES " + $"('{data.Name}', '{data.Pingying}', '{data.Alias}', '{data.Position}', '{data.Anatomy}', '{data.Indication}', '{data.Operation}', '{data.ImageGeneral}', '{data.ImagePoint}');");  1,433 total views

2020-05-18 0 Comments 596 Views 0 Like IT Team Read more
DotNET

Crawler - get data from unwrapped html content

Content data on webpage: It's html code looks like this: Parsing data using regular expression with named group. Use foreach loop to avoid the order change on page. Do not use the text value from DOM function/property, because the text value does not contain html tag, which gives you the necessary mark on data.  13,495 total views

2020-05-17 3 Comments 3288 Views 0 Like IT Team Read more
Architecture

Visual Paradigm v16.1 Installation

Download Visual Paradigm Community https://www.visual-paradigm.com/download/community.jsp?platform=windows&arch=64bit Install Visual Paradigm Online Training https://www.visual-paradigm.com/training/visual-paradigm-essential/ User Guide User Guide  2,047 total views

2020-05-14 0 Comments 701 Views 0 Like IT Team Read more
DevOps

Install SQL Server 2019, Data tools etc.

SQL Server 2019 Installation Download en_sql_server_2019_developer_x64_dvd_baea4195.iso file from MSDN Subscription. Mount iso file to file system. Run Setup.exe from the mounted CD-ROM. Don't worry about the warning, just ensure the ports are open if you need remote access. Now the installation is completed and ready to use. Let's see other options in the installation SSMS - SQL Server Management Studio Download SQL Server Management Studio (SSMS) SSAS,SSRS,SSIS Install SSDT Launch Visual Studio Installer Install SSAS, SSRS Project Templates Open Manage Extensions from Visual Studio menu. Search for ssdt. When you close Visual Studio the installation started automatically. Install SSIS Project templates Click on Download button, the linked browser will be open for download, or use this link to download Microsoft.DataTools.IntegrationServices.exe file. https://marketplace.visualstudio.com/items?itemName=SSIS.SqlServerIntegrationServicesProjects Done Now open Visual Studio, the newly added project templates are shown in the list.  3,663 total views

2020-05-13 0 Comments 1217 Views 0 Like IT Team Read more
Source control

github failed to push to the remote repository. see the output window for more details

Problem Clicked Revert button by mistake, and then make change on code, try to Commit /Push, got failed. Look at the Commit details, it looks like this: Error message "github failed to push to the remote repository. see the output window for more details". Fix Go to settings -> Emails, uncheck "Block command line pushes that expose my email" option. Successfully pushed.  4,211 total views

2020-05-11 0 Comments 1074 Views 0 Like IT Team Read more
DotNET

String concatenation in c# and php

C# Using + operator Console.WriteLine("Hello" + " " + "String " + "!"); String Interpolation string author = "Mahesh Chand"; string book = "C# Programming"; string bookAuthor = $"{author} is the author of {book}."; String.Concatenate() method string fName = "Mahesh"; string lName = "Chand"; string Name = string.Concat(fName, lName); string[] authors = { "Mahesh Chand ", "Chris Love ", "Dave McCarter ", "Praveen Kumar "}; string arrayStr = string.Concat(authors); String.Join() method int[] intArray = { 1, 3, 5, 7, 9 }; String seperator = ", "; string result = "Int, "; result += String.Join(seperator, intArray); // Using String.Join(String, String[], int int) // Let's concatenate first two strings of the array String[] arr2 = { "Mahesh Chand ", "Chris Love ", "Dave McCarter ", "Praveen Kumar " }; String seperator2 = ", "; string result2 = "First Author, "; result2 += String.Join(seperator2, arr2, 1, 2); Console.WriteLine($"Result: {result2}"); String.Format() method string date = String.Format("Today's date is {0}", DateTime.Now); StringBuilder.Append() method builder.Append(", "); PHP Using . operator $b = "Hello " . "World!"; // slow String Interpolation $a = '3'; echo "qwe{$a}rty"; // double quote echo "Result: " . ($a + 3); // result 6 "{$str1}{$str2}{$str3}"; // one concat = fast   $str1. $str2. $str3;    // two concats = slow //Use double quotes to concat more than two strings instead of multiple '.' operators.  PHP is forced to re-concatenate with every '.' operator. $logMessage = "A {$user->type} with e-mailaddress {$user->email} has performed {$action} on {$subject}." Using printf() $logMessage = sprintf('A %s with email %s has performed %s on %s.', $user->type, $user->email, $action, $subject);    1,907 total views

2020-04-29 0 Comments 875 Views 0 Like IT Team Read more
1…34567…18
Chinese (Simplified) Chinese (Simplified) Chinese (Traditional) Chinese (Traditional) English English French French German German Japanese Japanese Korean Korean Russian Russian
Newest Hotspots Random
Newest Hotspots Random
All Unicode Chars How to keep and display contact form 7 data Common Regular Expressions How to block bad bots in Appache server Add ALT attribute to Social Media Icon Change Event Calendar Title
All Unicode Chars
Tracking referral ID for registered user in WordPress How to keep and display contact form 7 data Temporary tables lifetime Show any data in DataTables in WordPress Send report in the attachment from database Generate city list for UPDATE statement in MySQL
Categories
  • Architecture
  • BI
  • C#
  • CSS
  • Database
  • DotNET
  • Hosting
  • HTML
  • JavaScript
  • PHP
  • Program Language
  • Python
  • Security
  • SEO
  • Technology
  • Web
  • Wordpress

COPYRIGHT © 2021 Hostlike IT Blog. All rights reserved.

This site is supported by Hostlike.com