1
Web-Mining: an Introduction
2
Background & Motivation
  • The web is a vast, distributed pool of semi-structured information.


  • Explosive growth in the amount of data has created the need to:
  • Manage large volumes of data and present them in a structured, orderly way.
  • Intelligently find information resources.
  • Analyze and track usage patterns.


  • Answer/Solution: Web-Mining.
3
What is Web Mining?

  • WWW



4
Web-Mining


  • The key objective is to develop “more” intelligent tools for information retrieval to help the user in finding, extracting, filtering & evaluating the desired information and resources.


  • Development of algorithms.
5
Evolution of Web-Mining (or Different Paradigms)
  • It started in the early 1990s with observing user behavior (viewing, bookmarking, browsing history).


  • Some people were interested in understanding web content (textual information).


  • There is a powerful philosophy: if we understand the ontology of a website, we will be able to generate a knowledge base (KB) or database (DB) that reflects this ontology. (I am not sure how we will do it, but I guess it can be done!)
6
“Web” (WWW)
  • It involves 3 kinds of data:


  • Data on the web (Web-Content).


  • Web log data regarding the users who browsed the page.


  • Web-structure data.
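
The three kinds of data listed above can be made concrete with a small sketch. It uses only the Python standard library, and everything in it is an assumption made for illustration: the sample page, the sample log line (in a Common Log Format-style layout), and the helper names `PageParser`, `extract_content_and_structure`, and `parse_log_line` are hypothetical and do not come from the slides.

```python
from html.parser import HTMLParser
import re


class PageParser(HTMLParser):
    """Collects visible text (web content) and link targets (web structure)."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Hyperlinks describe how pages point to each other: structure data.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Non-empty text nodes are the page's content data.
        if data.strip():
            self.text_parts.append(data.strip())


# Hypothetical page and log line, invented for this example.
SAMPLE_HTML = (
    "<html><body><h1>Web Mining</h1>"
    '<p>See <a href="/taxonomy.html">the taxonomy</a>.</p></body></html>'
)
SAMPLE_LOG = (
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 2326'
)

# Assumed log layout: host, two ignored fields, timestamp, request, status, size.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def extract_content_and_structure(html):
    """Return (text, links) for one HTML document."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    text, links = extract_content_and_structure(SAMPLE_HTML)
    print("content:  ", text)
    print("structure:", links)
    print("usage:    ", parse_log_line(SAMPLE_LOG))
```

Each of the three outputs would feed a different branch of web mining: the text goes to content analysis, the links to structure analysis, and the parsed log records to usage analysis.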



7
Web-Mining Taxonomy
  • We can broadly classify it as:


8
Web-Mining Approaches
  • Two approaches



9
Web-Mining Approaches
  • Information Retrieval Approach
  •  - To assist or improve information finding for users, based on either inferred or solicited user profiles.



  • Database Approach
  •  - To model the data on the web and to integrate it.
  •  - The philosophy is that queries more sophisticated than keyword searches could then be performed.
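
As a rough illustration of the database approach described above, the sketch below models a handful of pages and their hyperlinks as relational tables and then asks a question that a plain keyword search cannot express. It is only a sketch under stated assumptions: the schema (`pages`, `links`), the sample rows, and the function names `build_db` and `well_linked_pages` are hypothetical and not taken from the slides. It relies on Python's built-in `sqlite3` module.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding a tiny, made-up web graph."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, body TEXT);
        CREATE TABLE links (src TEXT, dst TEXT);
        """
    )
    # Hypothetical pages, invented for this example.
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?)",
        [
            ("/a", "Intro", "An introduction to web mining"),
            ("/b", "Usage", "Usage mining from server logs"),
            ("/c", "About", "Contact details"),
        ],
    )
    # Hypothetical hyperlinks: (source page, destination page).
    conn.executemany(
        "INSERT INTO links VALUES (?, ?)",
        [("/a", "/b"), ("/c", "/b"), ("/b", "/a")],
    )
    return conn


def well_linked_pages(conn, keyword, min_inlinks):
    """Pages whose body contains `keyword` AND that have at least
    `min_inlinks` incoming links: a content condition combined with a
    structure condition."""
    rows = conn.execute(
        """
        SELECT p.url, COUNT(l.src) AS inlinks
        FROM pages AS p
        JOIN links AS l ON l.dst = p.url
        WHERE p.body LIKE '%' || ? || '%'
        GROUP BY p.url
        HAVING COUNT(l.src) >= ?
        ORDER BY p.url
        """,
        (keyword, min_inlinks),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_db()
    print(well_linked_pages(conn, "mining", 2))
```

Once the web data sits in one integrated model, text conditions and link conditions can be combined in a single declarative query, which is the kind of query a keyword search alone cannot answer.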
10
Database Approach
  • Machine learning and data mining techniques are absent from the process.


  • Its approach is illustrated below: