diff --git a/README.md b/README.md
index 19765808..ac1c0aa5 100644
--- a/README.md
+++ b/README.md
@@ -303,6 +303,8 @@ AutoParseCrawler is also an Executor plugin, a Requester plugin and a Visitor pl

 + Just override the corresponding methods of your AutoParseCrawler. For example, if you are using BreadthCrawler, all you have to do is override the `Page getResponse(CrawlDatum crawlDatum)` method.
 + Create a new class which implements Requester interface and implement the `Page getResponse(CrawlDatum crawlDatum)` method of the class. Instantiate the class and use `crawler.setRequester(the instance)` to mount the plugin to the crawler.
+### Just to be clear,Using the okHttp3 plugin, which by default is Gzip compressed, the data returned by the request is automatically decompressed.If you manually set accept-encoding, you'll need to decompress the data returned by the request, and okHttp3 won't help you decompress the data.
+

 ## Customizing Requester Plugin

@@ -368,4 +370,4 @@ Element contentElement = ContentExtractor.getContentElementByUrl(url);
 ## Other Documentation

 + [中文文档](https://github.com/CrawlScript/WebCollector/blob/master/README.zh-cn.md)
--->
\ No newline at end of file
+-->