NumPy: scientific computing
HOME: http://www.numpy.org/
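NumPy is the fundamental package for scientific computing with Python.
A scientific computing library and open-source numerical extension for Python. Many of NumPy's compiled routines release Python's GIL (global interpreter lock) while they run, so array operations are very efficient. NumPy is the base library for a large number of machine learning frameworks.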
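As a rough illustration of why array operations are efficient, the sketch below computes the same total twice: once with a vectorized NumPy expression and once with a plain Python loop. The sample data and variable names are made up for this example, and it shows vectorization only, not GIL behavior.

```python
import numpy as np

# Hypothetical sample data: the integers 0..999 as 64-bit floats.
values = np.arange(1000, dtype=np.float64)

# Vectorized form: the multiply and the sum both run in compiled code.
vectorized_total = float(np.sum(values * 2.0))

# Equivalent pure-Python loop, shown only for comparison.
loop_total = 0.0
for v in values:
    loop_total += float(v) * 2.0

print(vectorized_total, loop_total)
```

Both forms give the same result; on large arrays the vectorized form is usually much faster because the per-element work does not go through the interpreter.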
Pandas: scientific computing library, built on NumPy
HOME: http://pandas.pydata.org/
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
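A minimal sketch of the "easy-to-use data structures" mentioned above, assuming a small made-up sales table. The column names and values are hypothetical; `DataFrame` and `groupby` are standard pandas.

```python
import pandas as pd

# Hypothetical sample table; names and numbers are illustrative only.
df = pd.DataFrame({
    "city": ["Beijing", "Shanghai", "Beijing", "Shanghai"],
    "sales": [10.0, 20.0, 30.0, 40.0],
})

# Group rows by city and take the mean of the sales column.
mean_sales = df.groupby("city")["sales"].mean()

print(mean_sales)
```

The result is a `Series` indexed by city, which can be passed on to further pandas operations or plotted.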
Matplotlib: plotting
HOME: https://matplotlib.org/
Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.
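A short sketch of a typical script-style use, assuming the non-interactive `Agg` backend so that it also runs on a machine without a display. The sine-curve data is invented for the example, and the figure is written to an in-memory buffer rather than a file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumed here so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: one period of a sine curve.
x = np.linspace(0.0, 2.0 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render to PNG in memory; pass a file name instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

In a Jupyter notebook the figure would normally be displayed inline instead of being saved.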
Seaborn: high-level plotting
HOME: http://seaborn.pydata.org/
statistical data visualization
Seaborn is a library for making statistical graphics in Python. It is built on top of matplotlib and closely integrated with pandas data structures.
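To illustrate the pandas integration described above, here is a minimal sketch that passes a `DataFrame` directly to `seaborn.scatterplot` and refers to columns by name. The data and column names are hypothetical, and the `Agg` backend is assumed so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumed here so no display is needed
import pandas as pd
import seaborn as sns

# Hypothetical sample data; names and numbers are illustrative only.
df = pd.DataFrame({
    "height_cm": [150, 160, 170, 180, 165, 175],
    "weight_kg": [50, 58, 68, 80, 60, 72],
    "group": ["a", "a", "a", "b", "b", "b"],
})

# Columns are referenced by name; seaborn returns a matplotlib Axes.
ax = sns.scatterplot(data=df, x="height_cm", y="weight_kg", hue="group")
ax.set_title("Height vs. weight (sample data)")
```

Because the return value is an ordinary matplotlib `Axes`, any further styling can be done with the matplotlib API.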
sklearn: machine learning
HOME: http://scikit-learn.org
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
scikit-learn: Machine Learning in Python
- Simple and efficient tools for data mining and data analysis
- Accessible to everybody, and reusable in various contexts
- Built on NumPy, SciPy, and matplotlib
- Open source, commercially usable - BSD license
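The three imports listed above fit together in a common workflow: split the data, fit a model, and score it on the held-out part. The sketch below is one possible arrangement, assuming synthetic data generated for the example; the data, the hyperparameters, and the variable names are not from the original notes.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data: y is roughly 3 * x plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(0.0, 0.5, size=200)

# Hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a random forest on the training rows only.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Mean absolute error on the held-out rows.
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"MAE: {mae:.3f}")
```

Fixing `random_state` makes the split and the forest reproducible, which helps when comparing runs.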
jieba, gensim, WordCloud: text analysis