I train machines to train models.
Simplest way to Build Web Crawler
A web crawler, sometimes called a spiderbot or scraper, is an internet bot that systematically browses the web. With one, we can collect the information we need without copying and pasting by hand. This article shows how I scrape web pages and store the results in a database or a CSV file.
2021-01-25
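As a rough illustration of the scrape-then-store idea, here is a minimal sketch that collects the links on one page and writes them out as CSV. It uses only the Python standard library (`html.parser`, `csv`) rather than whatever tools the article itself uses, and the names `LinkCollector`, `links_to_csv` and `SAMPLE_HTML`, as well as the example.com URLs, are assumptions made up for this example.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (absolute URL, anchor text) pairs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []     # finished (url, text) pairs
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def links_to_csv(html, base_url):
    """Parse the page and return its links as CSV text with a header row."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "text"])
    writer.writerows(parser.links)
    return buffer.getvalue()


# Hypothetical page content; a real crawler would download this instead.
SAMPLE_HTML = """
<html><body>
  <a href="/posts/crawler">Web Crawler</a>
  <a href="https://example.com/about">About</a>
</body></html>
"""

print(links_to_csv(SAMPLE_HTML, "https://example.com/blog/"))
```

To store the result on disk, the same rows could be written to a file opened with `open(path, "w", newline="")` instead of the in-memory buffer.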
Train Word2Vec Model on WSL
In this article, I build my own pre-trained word embeddings on WSL. WSL stands for Windows Subsystem for Linux, a compatibility layer for running Linux binary executables (in ELF format) natively on Windows 10. I train the model on Linux instead of Windows because running C++ code and some other packages on Windows is not user-friendly.
2021-01-22
Set Up Anaconda for Python
Python has been getting more popular recently because it lets you complete a project in a short time. However, once you work on several projects, setting up a virtual environment for each one becomes crucial. In this article, I will show how I set up an Anaconda environment for Python.
2020-12-31
TensorFlow 2.0 Installation
People both love and hate TensorFlow. It is an end-to-end open source platform for machine learning and deep learning. However, I have run into trouble installing TensorFlow many times, so I decided to share my experience to help others solve the same problem.
2020-11-28
EDA for Predicting Insurance Claim
Exploratory Data Analysis (EDA) means understanding a data set by summarizing its main characteristics, often by plotting them. This step is especially important before we model the data with machine learning. In this article, I'll show you how I did it!
2020-11-27
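To make "summarizing the main characteristics" concrete, here is a small sketch of two typical EDA steps: descriptive statistics for a numeric column and the target rate per category. It uses only the standard library instead of a data-frame package, and the records, column names (`age`, `vehicle_damage`, `claim`) and helper functions are hypothetical, not taken from the article's data set.

```python
import statistics

# Hypothetical records; column names and values are made up for illustration.
records = [
    {"age": 25, "vehicle_damage": "Yes", "claim": 1},
    {"age": 40, "vehicle_damage": "No", "claim": 0},
    {"age": 33, "vehicle_damage": "Yes", "claim": 1},
    {"age": 58, "vehicle_damage": "No", "claim": 0},
    {"age": 47, "vehicle_damage": "Yes", "claim": 0},
]


def summarize_numeric(rows, column):
    """Return basic descriptive statistics for one numeric column."""
    values = [row[column] for row in rows]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def target_rate_by_group(rows, group_column, target_column):
    """Return the mean of a 0/1 target column for each category value."""
    totals = {}
    for row in rows:
        hits, size = totals.get(row[group_column], (0, 0))
        totals[row[group_column]] = (hits + row[target_column], size + 1)
    return {group: hits / size for group, (hits, size) in totals.items()}


print(summarize_numeric(records, "age"))
print(target_rate_by_group(records, "vehicle_damage", "claim"))
```

Comparing the target rate across categories like this is a quick way to spot which features may be worth plotting in more detail.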
Collect Tweets using Twint
Twint is an advanced Twitter scraping tool written in Python that lets you scrape Tweets from Twitter profiles without using Twitter's API. Twint uses Twitter's search operators to let you scrape Tweets from specific individuals, scrape Tweets about specific themes, hashtags, and trends, and pick out sensitive information such as e-mail addresses and phone numbers from Tweets. I find this quite handy, and you can get fairly creative with it as well.
2020-10-16
QS Ranking Crawler
This article builds a web scraper with BeautifulSoup and Selenium and uses it to scrape the QS Rankings to discover the top universities around the world. The "Uni name", "ranking" and "location" fields are fetched from the table and stored in a CSV file. A Jupyter notebook is also available on my GitHub.
2020-04-15
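The table-to-CSV step can be sketched as follows. This is not the article's BeautifulSoup and Selenium code: it swaps in the standard-library `html.parser` so it runs without extra packages, and the table layout, the placeholder rows ("University A", "Country X") and the names `TableRowParser` and `table_to_csv` are assumptions for illustration only.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row currently open, if any
        self._cell = None   # text fragments of the cell currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip header rows that contain only <th> cells
                self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table's data rows as CSV text with a fixed header."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ranking", "uni_name", "location"])
    writer.writerows(parser.rows)
    return buffer.getvalue()


# Placeholder table; the real page is rendered by JavaScript, which is why
# the article drives a browser with Selenium before parsing the HTML.
SAMPLE_TABLE = """
<table>
  <tr><th>Rank</th><th>University</th><th>Location</th></tr>
  <tr><td>1</td><td>University A</td><td>Country X</td></tr>
  <tr><td>2</td><td>University B</td><td>Country Y</td></tr>
</table>
"""

print(table_to_csv(SAMPLE_TABLE))
```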
Sentiment Analysis for KKBOX
This sentiment classification task is based on review data for UtaPass and KKBOX from the Google Play platform. As a KKStreamer at KKBOX, I became more interested in Natural Language Processing, especially text classification. First, I crawl the text data using web crawling tools, namely BeautifulSoup and Selenium. Second, I develop several neural network architectures, including simple RNN, LSTM, GRU, and CNN, to detect the polarity of customer reviews.
2019-07-10
Categorising Song Genre by Analysing Lyrics
The ability to classify music automatically has become increasingly important with the advent of music streaming services that allow greater access to music. Spotify alone hit 100 million users in 2016, with other services provided by companies such as Apple, Soundcloud and YouTube. In addition, there are huge numbers of professional musicians, approximately 53,000 in the USA alone, as well as amateurs, all producing music that needs to be classified. With this quantity of music, it is infeasible to classify genres without an automated method.
2019-06-11