Web Scraping with lxml: What you need to know
In this post, you will learn how to use lxml and Python to scrape data from Steam. I will teach you the basics of XPath so that you can scrape data from any similar website easily. In the end, you will also learn how to generate a JSON output from your script. So what are you waiting for? Let's begin!
python  screen  scraping  4* 
3 days ago by ianweatherhogg
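To give a feel for the workflow the lxml entry above describes (parse a page, select nodes with XPath, emit JSON), here is a minimal sketch. It does not reproduce the linked tutorial's code: the inline HTML, the class names `game`, `title` and `price`, and the function `scrape_games` are all hypothetical stand-ins, and a real store page would need its own XPath expressions.

```python
import json

from lxml import html

# Hypothetical markup standing in for a listing page. The structure and
# class names are invented for this sketch, not taken from Steam.
SAMPLE_PAGE = """
<html><body>
  <div class="game">
    <span class="title">Alpha Quest</span>
    <span class="price">$9.99</span>
  </div>
  <div class="game">
    <span class="title">Beta Racer</span>
    <span class="price">Free</span>
  </div>
</body></html>
"""


def first_text(node, expression):
    """Return the first text result of an XPath query, stripped, or None."""
    results = node.xpath(expression)
    return results[0].strip() if results else None


def scrape_games(page_source):
    """Parse HTML and collect one dict per (hypothetical) game container."""
    tree = html.fromstring(page_source)
    games = []
    # '//' searches the whole document; '[@class="game"]' filters by attribute.
    for node in tree.xpath('//div[@class="game"]'):
        games.append({
            # './' makes the query relative to the current container.
            "title": first_text(node, './span[@class="title"]/text()'),
            "price": first_text(node, './span[@class="price"]/text()'),
        })
    return games


games = scrape_games(SAMPLE_PAGE)
print(json.dumps(games, indent=2))
```

To scrape a live page instead, the page source would be fetched first (for example with an HTTP client) and passed to `scrape_games` in place of `SAMPLE_PAGE`.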
GitHub - postlight/mercury-parser: 📜 Extracting content from the chaos of the web.
📜 Extracting content from the chaos of the web.
github  scraping  parsing  javascript 
5 days ago by synergyfactor
GitHub - CU-ITSS/Web-Data-Scraping-S2019
This is a five-week one-credit "mini-course" on retrieving ("scraping") data from the web. The course is intended for researchers in the social sciences and humanities with computational instincts but limited or no prior programming experience. Each class will be 2.5 hours long: we'll take a break mid-way for biological input and output. Lectures will use a combination of lecture-by-notebook as well as hands-on exercises. The end of each class will have links to resources and additional take-home exercises. Students will have the option of presenting their solutions to the take-home exercises at the beginning of the next class.
scraping  tutorial  python  notebook 
5 days ago by paulbradshaw
Pricing |
*expensive* intelligent data scraping from thousands of websites
data  aggregators  webpages  scraping 
6 days ago by GreggInCA
HiQ v. LinkedIn and the Legality of Web Scraping – Knowmad Law – Medium
hiQ Labs is a data science company that develops tools to help corporate HR departments keep tabs on their workforces. Included in the information hiQ provides to its clients is data scraped from their employees’ public LinkedIn profiles. On May 23, 2017, LinkedIn sent hiQ a cease and desist letter demanding that they stop the practice. Two weeks later, hiQ filed a lawsuit in the Northern District of California, asking the court for a declaratory judgment that scraping LinkedIn’s data was lawful.

On August 14, 2017, the judge granted hiQ’s motion for a temporary restraining order preventing LinkedIn from blocking hiQ’s access to their site while the case was pending, a decision that LinkedIn then appealed to the Ninth Circuit.

In the meantime, scraping has taken on a new political dimension. Mark Zuckerberg’s awkward two-day testimony before Congress last week was necessitated largely by the accusation that Facebook has failed to protect its users’ data from collection by predatory third parties such as Cambridge Analytica.
scraping  linkedin  hiq  law 
7 days ago by paulbradshaw
