
Web Mining and Web Intelligence

Data Mining for Web


People involved

Alípio Jorge (coord.), Nuno Escudeiro, Ricardo Campos, João Vinagre and Jorge Morais.

Activities

  • Recommendation and Web adaptation: We develop tools and algorithms for web recommendation and adaptation. We study recommender systems for binary feedback, the use of background and context information to enhance recommender algorithms, and incremental approaches to recommender algorithms. We also combine usage, content and structure information in recommender models. We use collaborative filtering and association rules, among other techniques. We currently have one application in production at the site PalcoPrincipal.pt for the recommendation of music tracks.

  • Information retrieval and extraction: Retrieving and extracting relevant information from the web is a demanding task that can only be done with the support of automatic tools. We investigate the use of machine learning and data analysis techniques to facilitate this process. We also explore temporal information in web content to associate implicit temporal information with queries.

  • Web / Content Management Automation: This sub-area deals with the automation of tasks related to web site maintenance, from the point of view of content, structure and presentation. We have developed EdMate, a methodology and tool for assisting editors of content bases, such as web portals. This work has been done in collaboration with the company PortalExecutivo. This tool is based on the analysis of content meta-data and web logs.


Projects

  • Site-o-Matic - Web Site Automation (Project POSI/EIA/58367/2004)

  • Mail-maid: automatic mail classification platform, led by Nuno Escudeiro and currently with a pilot version running on the APPIA web server.
  • Knowledge Base (CRM/Data Mining): development of algorithms for implicit data analysis using artificial intelligence methods (project SIME 80/00666), in collaboration with PortalExecutivo.com (2002-2003).


Main Publications

  • Carlos Soares, R. Ghani (eds.). Data Mining for Business Applications. This book, published by IOS Press (2010), includes some relevant chapters in this area.
  • Ana Catarina Miranda, Alipio M. Jorge, Item-Based and User-Based Incremental Collaborative Filtering for Web Recommendations, Progress in Artificial Intelligence, Proceedings of the 14th Portuguese Conference on Artificial Intelligence (EPIA 2009), Springer LNCS Volume 5816, page 673--684 - October 2009
  • Nuno Escudeiro, Alipio M. Jorge, Efficient Coverage of Case Space with Active Learning, Progress in Artificial Intelligence, Proceedings of the 14th Portuguese Conference on Artificial Intelligence (EPIA 2009), Springer LNCS Volume 5816, page 411--422 - October 2009
  • Ana Catarina Miranda, Alipio M. Jorge, Incremental collaborative filtering for binary ratings, Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Volume 1, page 389--392 - December 2008
  • Marcos Aurélio Domingues. An independent platform for the monitoring, analysis and adaptation of web sites. Proceedings of the 2008 ACM Conference on Recommender Systems (RecSys 2008), page 299--302
  • Carmen Rebelo, Pedro Quelhas Brito, Carlos Soares, Alipio M. Jorge, Quantitative Evaluation of Clusterings for Marketing Applications: a Web Portal Case Study, Progress in Artificial Intelligence, Proceedings of the 13th Portuguese Conference on Artificial Intelligence Workshops (EPIA 2007), Springer LNCS Volume 4874, page 437--448 - December 2007
  • Nuno Escudeiro, Alipio M. Jorge, Semi-automatic creation and maintenance of Web Resources with webTopic, Semantics, Web and Mining - Proceedings of the Joint International Workshop on European Web Mining Forum (EWMF 2005) and the Workshop on Knowledge Discovery and Ontologies (KDO 2005), Springer LNCS, Volume 4289, page 82--102 - 2006
  • Ana Costa e Silva, Alipio M. Jorge, Luís Torgo, Design of an end-to-end method to extract information from tables, International Journal on Document Analysis and Recognition, Volume 8, Number 2-3, page 144--171 - June 2006
  • Carlos Soares, Alípio Jorge, and Marcos Aurélio Domingues, Monitoring the Quality of Meta-Data in Web Portals Using Statistics, Visualization and Data Mining, Proceedings of EPIA 2005, LNCS, Vol. 3808, pp 371 - 382, Springer, December 2005

    PhD Theses Completed

    2010

    • Marcos Aurélio Domingues, "Exploiting Multidimensional Data for Web Site Automation", PhD in Computer Science, University of Porto. Sup. Alípio Jorge and Carlos Soares.

    Recent MSc Theses Completed

    2010

    • Hugo Dias, Adaptabilidade Web 2.0, Mestrado em Engenharia de Redes e Sistemas Informáticos, Faculdade de Ciências, Universidade do Porto, 2010, sup. José Paulo Leal and Alípio Jorge.
    • João Vinagre, Forgetting mechanisms for scalable collaborative filtering, Mestrado em Engenharia de Redes e Sistemas Informáticos, Faculdade de Ciências, Universidade do Porto, 2010, sup. Alípio Jorge.
    • Daniel Gonçalves, "Modelo Generalista de páginas Web para Motores de Pesquisa com Interface 3D", Mestrado em Engenharia Informática, ISEP-IPP, sup. Nuno Escudeiro.
    • Alexandra Alves, "Modelo de representação de texto mais adequado à classificação", Mestrado em Engenharia Informática, ISEP-IPP, sup. Nuno Escudeiro.
    • Filipe Fortuna, "Recomendação de produtos em sítios web: um caso de estudo" (Product recommendation in web sites: a case study), MADSAD, Faculdade de Economia, Universidade do Porto, sup. Carlos Soares.

      2009

      • Ana Carneiro, "Relação entre usabilidade e retorno de capital no comércio electrónico: aplicação ao caso Introduxi", Mestrado em Análise de Dados e Sistemas de Apoio à Decisão, Faculdade de Economia, Universidade do Porto, sup. Alípio Jorge and Pedro Quelhas Brito.
      • Luiza Gabriel (2009), Automatic email organization, MSc Informatics Engineering, ISEP, sup. Nuno Escudeiro.

      PhD Theses in Progress

      • Nuno Escudeiro on the use of active learning for text classification
      • António Jorge Morais on a market-based approach to recommender systems
      • Mário Amado Alves on principles and data structures suitable for adaptive hypertext bases
      • Ricardo Campos on temporal information retrieval

      Organized Events