Personalized Content Extraction and Text Classification Using Effective Web Scraping Techniques

Karthikeyan T., Karthik Sekaran, Ranjith D., Vinoth Kumar V., Balajee J M

Source Title: International Journal of Web Portals (IJWP) 11(2)

ISSN: 1938-0194 | EISSN: 1938-0208 | EISBN13: 9781522565192 | DOI: 10.4018/IJWP.2019070103

Cite Article

MLA

Karthikeyan T., et al. "Personalized Content Extraction and Text Classification Using Effective Web Scraping Techniques." IJWP, vol. 11, no. 2, 2019, pp. 41-52. http://doi.org/10.4018/IJWP.2019070103

APA

Karthikeyan T., Sekaran, K., Ranjith D., Vinoth Kumar V., & Balajee J M. (2019). Personalized Content Extraction and Text Classification Using Effective Web Scraping Techniques. International Journal of Web Portals (IJWP), 11(2), 41-52. http://doi.org/10.4018/IJWP.2019070103

Chicago

Karthikeyan T., et al. "Personalized Content Extraction and Text Classification Using Effective Web Scraping Techniques." International Journal of Web Portals (IJWP) 11, no. 2 (2019): 41-52. http://doi.org/10.4018/IJWP.2019070103

Abstract

Web scraping is a technique for automatically extracting information from web documents. It retrieves content relevant to a query, then aggregates the data and transforms it from an unstructured format into a structured representation. Text classification is a vital step for summarizing the data and categorizing the webpages adequately. In this article, data is first extracted from websites using effective web scraping methodologies and then transformed into a structured form. The documents are classified and labeled based on keywords from the data. A recursive feature elimination technique is applied to the data to select the best candidate feature subset. Standard machine learning algorithms are then trained on the final dataset. The proposed model performs well at classifying documents from the extracted data, with an improved accuracy rate.
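The first stages described in the abstract (extracting page content, turning it into a structured record, and labeling it by keywords) might be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. The class `TextExtractor`, the functions `to_record` and `label`, the `KEYWORDS` vocabulary, and the `SAMPLE` page are all hypothetical names and data introduced here. The recursive feature elimination and model training steps are not shown.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the <title> text and paragraph text from an HTML page."""

    def __init__(self):
        super().__init__()
        self._stack = []       # open <title>/<p> tags we are currently inside
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._stack.append(tag)
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        if not self._stack:
            return  # text outside <title> and <p> is ignored
        if self._stack[-1] == "title":
            self.title += data
        else:
            self.paragraphs[-1] += data


# Hypothetical keyword lists; the abstract does not give the real vocabulary.
KEYWORDS = {
    "sports": {"match", "team", "score"},
    "finance": {"market", "stock", "bank"},
}


def to_record(html):
    """Turn unstructured HTML into a structured record (a plain dict)."""
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"title": parser.title.strip(), "text": text}


def label(record):
    """Assign the category whose keywords occur most often, if any occur."""
    content = (record["title"] + " " + record["text"]).lower()
    tokens = re.findall(r"[a-z]+", content)
    counts = {
        name: sum(token in words for token in tokens)
        for name, words in KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unlabeled"


# Hypothetical page used in place of a live HTTP request.
SAMPLE = (
    "<html><head><title>Stock market update</title></head><body>"
    "<p>The <b>bank</b> sector led the market today.</p>"
    "<p>Analysts expect more gains.</p></body></html>"
)

record = to_record(SAMPLE)
record["label"] = label(record)
print(record)
```

Parsing an in-memory string keeps the sketch independent of any network access. In practice the HTML would come from fetched pages, and the labeled records would feed the feature selection and classification stages the abstract mentions.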
