System for Information Extraction from News Sites

Tihomir Stefanov


This paper presents a system for crawling news sites and extracting their content. A set of web crawlers extracts textual and graphic information and checks whether multimedia content is available. Part of the source code and the database are presented.


source code; information extraction; databases; information sites
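
The extraction step described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: it assumes pages are plain HTML, that article text sits in `<p>` elements, and that the presence of `video`, `audio`, `iframe`, `embed`, or `object` tags is an adequate signal for multimedia. The names `ArticleParser`, `MEDIA_TAGS`, and `extract` are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collects paragraph text, image URLs, and a multimedia flag from one page."""

    # Assumption: these tags are treated as evidence of multimedia content.
    MEDIA_TAGS = {"video", "audio", "iframe", "embed", "object"}

    def __init__(self):
        super().__init__()
        self.paragraphs = []         # cleaned text of each <p> element
        self.images = []             # values of <img src="...">
        self.has_multimedia = False  # True once any media tag is seen
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "p":
            self._in_p = True
            self._buf = []
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag in self.MEDIA_TAGS:
            self.has_multimedia = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Collapse runs of whitespace inside the paragraph.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def extract(html):
    """Return the text, image URLs, and multimedia flag for one HTML page."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {
        "text": "\n".join(parser.paragraphs),
        "images": parser.images,
        "has_multimedia": parser.has_multimedia,
    }
```

A crawler built along these lines would fetch each article URL, pass the response body to `extract`, and store the resulting fields in the database. Fetching, URL discovery, and storage are omitted here because the abstract does not specify them.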





