David I Widjaja
How to get the data?
Input from database
Topics are stored as strings
Topic Sentences (Subject)
How to create tags:
1. Get all topic sentences and split them on whitespace
2. Convert all words into lower case
3. Delete all numeric and duplicate values
4. Sort words alphabetically
5. Delete unnecessary words (e.g. is, the, and)
6. Search for synonyms and cluster them into a single tag
7. Translate words if necessary
8. Insert tags into main spreadsheet
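Steps 1 to 6 above could be sketched in Python roughly as follows. This is a minimal illustration, not the workflow actually used in the slides (which relies on a spreadsheet): the function name `make_tags`, the stop-word list, and the synonym map are all assumptions made for the example. Translation (step 7) and writing to the spreadsheet (step 8) are left out.

```python
import re

# Assumed, illustrative stop-word list and synonym map (not from the slides).
STOP_WORDS = {"is", "the", "and", "a", "an", "of", "to", "in"}
SYNONYMS = {"automobile": "car", "cars": "car"}

# Matches purely numeric tokens such as "2013" or "3.5".
NUMERIC = re.compile(r"\d+(?:[.,]\d+)*")


def make_tags(topic_sentences, stop_words=STOP_WORDS, synonyms=SYNONYMS):
    # Step 1: split every topic sentence on whitespace.
    words = [w for sentence in topic_sentences for w in sentence.split()]
    # Step 2: convert all words to lower case.
    words = [w.lower() for w in words]
    # Step 3: drop numeric values; the set removes duplicates.
    unique = {w for w in words if not NUMERIC.fullmatch(w)}
    # Step 4: sort alphabetically.
    ordered = sorted(unique)
    # Step 5: delete unnecessary (stop) words.
    kept = [w for w in ordered if w not in stop_words]
    # Step 6: map synonyms onto a single tag, then de-duplicate again.
    return sorted({synonyms.get(w, w) for w in kept})


tags = make_tags(["The car is red", "Automobile prices in 2013"])
print(tags)
```

Punctuation is not stripped here because the steps above do not mention it; real topic sentences would probably need that as an extra step.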
A weighted graph map is used:
The more words associated
with a tag, the bigger its bubble.
Lines get thicker with the number
of relationships between topics.
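The bubble sizes and line thicknesses could be computed along these lines. The slides do not define what counts as a relationship, so this sketch assumes that two tags are related once for every topic sentence in which they both occur, and that a bubble's weight is the number of times the tag appears overall. The function name `graph_weights` and the input layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def graph_weights(tagged_topics):
    """tagged_topics: one list of tags per topic sentence (assumed layout)."""
    node_weight = Counter()  # bubble size per tag
    edge_weight = Counter()  # line thickness per pair of tags
    for tags in tagged_topics:
        node_weight.update(tags)
        # Assumption: each pair of distinct tags sharing a topic adds 1 to its edge.
        unique = sorted(set(tags))
        edge_weight.update(combinations(unique, 2))
    return node_weight, edge_weight


nodes, edges = graph_weights([["car", "price"], ["car", "red"], ["car", "price"]])
print(nodes.most_common())
print(edges.most_common())
```

The two counters can then be handed to whatever tool draws the map, scaling bubble radius by node weight and line width by edge weight.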
Web scraping of other, similar websites
Put the topic sentences in the
subject column. Examples:
Copy to the previous spreadsheet (the one with
the previous tags).
Repeat the same process as before to build
the weighted graph map
Compare the two weighted graph maps
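One simple way to compare the two maps is to look at which tags they share and which appear in only one of them. The slides do not say how the comparison is done, so the sketch below is only an assumption; `compare_maps` and its result keys are hypothetical names.

```python
def compare_maps(own_tags, other_tags):
    """Compare the tag sets of two weighted graph maps (assumed approach)."""
    own, other = set(own_tags), set(other_tags)
    return {
        "shared": sorted(own & other),       # tags present in both maps
        "only_own": sorted(own - other),     # tags only in the database map
        "only_other": sorted(other - own),   # tags only in the scraped map
    }


result = compare_maps(["car", "price", "red"], ["car", "engine"])
print(result)
```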
Generate a word cloud using Python or online tools.
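For the Python route, the core of a word cloud is a frequency count scaled to font sizes. The sketch below uses only the standard library; the function name `font_sizes` and the size range are assumptions, and drawing the cloud itself is left to a plotting tool.

```python
from collections import Counter


def font_sizes(tags, min_size=10, max_size=48):
    """Map each tag to a font size proportional to how often it occurs."""
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        tag: min_size + (count - lo) * (max_size - min_size) / span
        for tag, count in counts.items()
    }


sizes = font_sizes(["car", "car", "car", "red"])
print(sizes)
```

If the third-party `wordcloud` package is installed, a `Counter` like the one above can instead be passed to `WordCloud().generate_from_frequencies(...)` to render the image directly.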
Microsoft Excel 2013 (Spreadsheet)
Mozilla Firefox (Browser)
Inspect Element (Search Patterns)
DownThemAll (Download HTMLs)
Total Commander (Merge HTMLs)
Notepad++ (Cleanse Data)