tiwariayush/python-web-extractor

python-web-extractor

Scrapes data from various e-commerce websites that load their data with JavaScript. So far it extracts from four online shopping websites: Amazon, Flipkart, Jabong and Myntra.

It contains two main scripts, extract.py and webkit_product_info.py. The first one caches the extracted data so that it can be reused later.
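The README does not describe how the cache in extract.py is stored, so the following is only a minimal sketch of the general idea, assuming a JSON file keyed by URL. The names cached_extract, fake_extract and cache.json are hypothetical and are not part of this repository; the sketch is written for Python 3.

```python
import json
import os
import tempfile


def cached_extract(url, extract_fn, cache_path):
    """Return extract_fn(url), reusing a result saved in cache_path if one exists."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as fh:
            cache = json.load(fh)
    if url not in cache:
        # Cache miss: run the (slow) extraction once and persist the result.
        cache[url] = extract_fn(url)
        with open(cache_path, "w") as fh:
            json.dump(cache, fh)
    return cache[url]


# Stand-in extractor (hypothetical) so the sketch runs without network access.
calls = []


def fake_extract(url):
    calls.append(url)
    return {"url": url, "title": "example product"}


cache_file = os.path.join(tempfile.mkdtemp(), "cache.json")
first = cached_extract("http://example.com/p/1", fake_extract, cache_file)
second = cached_extract("http://example.com/p/1", fake_extract, cache_file)
```

In this sketch the second call reads the saved result from disk instead of running the extractor again, which is the kind of reuse a cache is meant to provide.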

To get the product info, call the extract function from the module. For example:

    from extract import extract
    info = extract('http://www.flipkart.com/wills-lifestyle-men-s-printed-casual-shirt/p/itmduesu6zf6hrzn?pid=SHTDUESURJDHFWVB&srno=b_3&ref=dce83d46-bb86-4fd3-8000-140be7fc60d5')
    print(info)

You can print the result, as above, or use it however you like.

#Contributors

Ayush Tiwari

#Dependencies

#Installation/Usage

  • Fork and clone the repository.
  • Move the files extract.py and webpage_xpath.csv to your app directory.
  • Call the function extract(url) to get the data, and use it however you like.
  • To extract data from other websites, add their XPaths to webpage_xpath.csv.
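The column layout of webpage_xpath.csv is not documented here, so the sketch below only illustrates how a per-site XPath table could be looked up from a product URL. The columns site, field and xpath, the sample rows, the sample URL and the helper names load_xpaths and xpaths_for are all assumptions made for illustration (Python 3); check the real file before adding entries.

```python
import csv
import io
from urllib.parse import urlparse

# Hypothetical layout: one row per (site, field) pair. The real
# webpage_xpath.csv may use different columns.
SAMPLE_CSV = """site,field,xpath
flipkart.com,title,//h1[@class='title']
flipkart.com,price,//span[@class='price']
example-shop.com,title,//h1
"""


def load_xpaths(csv_file):
    """Build {site: {field: xpath}} from an open CSV file object."""
    table = {}
    for row in csv.DictReader(csv_file):
        table.setdefault(row["site"], {})[row["field"]] = row["xpath"]
    return table


def xpaths_for(url, table):
    """Return the XPath rules for the site that hosts url, or {} if unknown."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return table.get(host, {})


table = load_xpaths(io.StringIO(SAMPLE_CSV))
rules = xpaths_for("http://www.flipkart.com/some-product/p/itm123", table)
```

Under this assumed layout, supporting a new website would mean appending one row per field to the CSV, with no change to the lookup code.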

#License

python-web-extractor is licensed under the MIT license.
