html scraping

Checking websites for updates

Many websites are updated on an irregular basis. Some are updated every week, others every hour, and some are not updated for months. Users get bored of regularly visiting a website only to find no new content. Checking websites manually is also inefficient and costs a lot of time.

A solution to this problem should provide an easy way to check websites and stay up to date. It would be helpful if the tool could highlight changes. The tool should also be able to exclude certain areas of a website from checking, for example the current time or the "who's online" section, since changes there are unlikely to interest the user.

The tool should of course work on HTML files; it would also be good if it could detect changes in images. The solution should also cover non-HTML content, especially Flash. Flash is used a lot on the web, and this technology should not be left out. However, a solution for HTML alone would already work for most websites. CSS changes do not need to be included, as CSS should contain only formatting information and no content.
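The HTML part of these requirements (detect changes, highlight them, ignore excluded areas) can be illustrated with a minimal sketch using only the Python standard library. It is not a complete tool: the names `TextExtractor`, `visible_lines`, and `report_changes` are hypothetical, excluded areas are assumed to be identifiable by an `id` attribute, and the two page versions are assumed to be already downloaded and available as strings. The depth counting also assumes reasonably well-formed markup inside the excluded areas.

```python
import difflib
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style and excluded elements."""

    def __init__(self, excluded_ids):
        super().__init__()
        self.excluded_ids = set(excluded_ids)  # assumption: areas are marked by id
        self.skip_depth = 0                    # > 0 while inside a skipped element
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1
        elif tag in ("script", "style") or dict(attrs).get("id") in self.excluded_ids:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and not self.skip_depth:
            self.lines.append(text)


def visible_lines(html, excluded_ids=()):
    """Return the text fragments of a page, without the excluded areas."""
    parser = TextExtractor(excluded_ids)
    parser.feed(html)
    parser.close()
    return parser.lines


def report_changes(old_html, new_html, excluded_ids=()):
    """Return unified-diff lines; an empty list means no relevant change."""
    old = visible_lines(old_html, excluded_ids)
    new = visible_lines(new_html, excluded_ids)
    return list(difflib.unified_diff(old, new, "previous", "current",
                                     lineterm="", n=0))


if __name__ == "__main__":
    before = '<div id="clock">10:00</div><p>Hello</p>'
    after = '<div id="clock">10:05</div><p>Hello world</p>'
    # The clock differs too, but it is excluded from the comparison.
    for line in report_changes(before, after, excluded_ids=["clock"]):
        print(line)
```

Comparing extracted text rather than raw markup means that pure formatting changes are mostly ignored, which matches the point about CSS. Lines prefixed with `-` and `+` in the result mark removed and added text, which is one simple way to highlight changes. Images or Flash content are not covered by this sketch.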