Data mining isn’t screen-scraping. I know that some people in the room may disagree with that statement, but they’re really two almost completely different concepts.
To put it briefly, you might state it this way: screen-scraping allows you to get information, whereas data mining allows you to analyze information. That’s a pretty big simplification, so I’ll elaborate a bit.
The term “screen-scraping” comes from the old mainframe terminal days, when people worked on computers with green and black screens containing just text. Screen-scraping was used to extract characters from the screens so that they could be analyzed. Fast-forwarding to the web world of today, screen-scraping now most commonly refers to extracting information from web sites. That is, computer programs can “crawl” or “spider” through web sites, pulling out data. People often do this to build things such as comparison shopping engines, to archive web pages, or simply to download text to a spreadsheet so that it can be filtered and analyzed.
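To make the "pulling out data" step concrete, here is a minimal sketch using only Python's standard library. The HTML snippet, the class names `product`, `name`, and `price`, and the `scrape_products` helper are all made up for illustration; a real scraper would fetch pages over the network and cope with much messier markup.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this from a site.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span> <span class="price">$4.50</span></li>
  <li class="product"><span class="name">Gadget</span> <span class="price">$12.00</span></li>
</ul>
"""

class ProductScraper(HTMLParser):
    """Collects (name, price) pairs from spans marked with assumed class names."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (name, price) tuples
        self._field = None   # which field the next text node belongs to
        self._current = {}   # fields gathered for the product in progress

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field is None:
            return  # text outside the spans we care about
        self._current[self._field] = data.strip()
        self._field = None
        if "name" in self._current and "price" in self._current:
            price = float(self._current["price"].lstrip("$"))
            self.rows.append((self._current["name"], price))
            self._current = {}

def scrape_products(html):
    """Return the list of (name, price) tuples found in the given HTML."""
    parser = ProductScraper()
    parser.feed(html)
    return parser.rows

print(scrape_products(SAMPLE_HTML))
```

Notice that the code only gets the data out of the page. It does not try to learn anything from it.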
Data mining, on the other hand, is defined by Wikipedia as the “practice of automatically searching large stores of data for patterns.” In other words, you already have the data, and you’re now analyzing it to learn useful things about it. Data mining often involves lots of complex algorithms based on statistical methods. It has nothing to do with how you got the data in the first place. In data mining you only care about analyzing what’s already there.
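As a rough illustration of "searching for patterns" in data you already have, the sketch below counts which pairs of items show up together most often in a set of purchases. The `TRANSACTIONS` data, the `frequent_pairs` helper, and the `min_support` threshold are all invented for this example, and real data mining tools use far more sophisticated statistical methods than this simple pair count.

```python
from collections import Counter
from itertools import combinations

# Made-up data set: each set is one customer's purchase.
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def frequent_pairs(transactions, min_support=0.5):
    """Return item pairs that appear together in at least min_support of the transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}

print(frequent_pairs(TRANSACTIONS))
```

Here the code never touches a web page. It starts from data that has already been collected and looks only for a pattern in it, which is the distinction this post is drawing.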
The difficulty is that people who don’t know the term “screen-scraping” will try Googling for anything that resembles it. We include a number of these terms on our web site to help such folks; for example, we created pages entitled Text Data Mining, Automated Data Selection, Web Site Data Extraction, and even Web Site Ripper (I suppose “scraping” is sort of like “ripping”). So it presents a bit of a problem: we don’t necessarily want to perpetuate a misconception (i.e., screen-scraping = data mining), but we also have to use terminology that people will actually use.