NLP

String manipulation with Python (for NLP)

f-strings

Concept

The official name of f-strings is “formatted string literal.” An f-string is the “modern” way to put the values of variables into strings (as of Feb. 2020). Before f-strings appeared, we used the format method. For me, f-strings are much more intuitive than the format method.

Formatted string literals - Python official document
format - Python official document

Simple example

Both print lines in the following code print the string “It’s me, Mario”.
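A minimal sketch of such a comparison, assuming a variable called `name` (a hypothetical name chosen for illustration). It builds the same string once with an f-string and once with the format method:

```python
# `name` is a hypothetical variable introduced for this illustration.
name = "Mario"

# f-string: the expression inside {} is evaluated and inserted in place.
with_fstring = f"It's me, {name}"

# format method: the {} placeholder is filled by the argument to format().
with_format = "It's me, {}".format(name)

print(with_fstring)
print(with_format)
```

With the f-string, the variable sits right where its value will appear, which is why many people find it easier to read than a placeholder that is filled in at the end of the line.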

NLP tools in Python

Libraries

I need a few libraries for NLP, and each of them is very powerful. I downloaded all of these libraries via pip, like pip install -U {package}. In the last section, I summarized the libraries so that I can install them all at once later.

spaCy: Open source NLP library.
NLTK: Natural Language ToolKit. It is older than spaCy (spaCy since 2015, NLTK since 2001).
gensim: NLP tools. I installed it for Doc2Vec.
TensorFlow: For custom machine learning models. It includes Keras.
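After installing, it can be handy to check which of these libraries are actually importable in the current environment. The following is only a sketch: the helper `check_installed` and the list `MODULES` are hypothetical names, and the entries are the import names I assume for each package (they can differ from the names used with pip).

```python
from importlib.util import find_spec

# Assumed import names for the libraries listed above.
MODULES = ["spacy", "nltk", "gensim", "tensorflow"]


def check_installed(modules):
    """Return a dict mapping each module name to True if it can be found."""
    # find_spec returns None when a top-level module is not installed.
    return {name: find_spec(name) is not None for name in modules}


for name, ok in check_installed(MODULES).items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

Using `find_spec` avoids importing the libraries themselves, which matters for heavy packages that take a while to load.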

Regular expression - snippets

Regular expression

The regular expression is a basic concept in theoretical computer science. Once you see the Wikipedia page for “Regular expression”, you realize how important it is for understanding computer science. But for a beginner web engineer, the simple explanation is that a regular expression is, in a nutshell, just a “pattern”. Regular expression is often abbreviated to regex.

Regex rules (To be updated…)

Here is some frequently used regex syntax.
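As a small illustration of the “pattern” idea, here are a few common pieces of regex syntax tried out with Python's standard `re` module. The sample text and variable names are made up for this sketch, and the selection is not meant to be a complete list of rules.

```python
import re

# Made-up sample text for illustration.
text = "order 12 shipped in 2019, order 345 shipped in 2020"

# \d+ : one or more digits in a row.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['12', '2019', '345', '2020']

# \b\d{4}\b : exactly four digits, bounded by word boundaries.
years = re.findall(r"\b\d{4}\b", text)
print(years)  # ['2019', '2020']

# ^ anchors the match at the start; (\w+) captures a run of word characters.
first_word = re.search(r"^(\w+)", text).group(1)
print(first_word)  # order

# [aeiou] : a character class matching any single listed character.
no_vowels = re.sub(r"[aeiou]", "", "regex")
print(no_vowels)  # rgx

# ? : the preceding character is optional.
spellings = re.findall(r"colou?r", "color colour")
print(spellings)  # ['color', 'colour']
```

Raw strings (the `r"..."` prefix) keep backslashes such as `\d` and `\b` from being interpreted as Python string escapes.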