Data Science/Big Data/Machine Learning

  • pandas (data analysis)
  • NumPy (multi-dimensional arrays)
  • SciPy (algorithms to use with NumPy)
  • matplotlib (data visualization)
  • HDF5 (store & manipulate data)
  • PyTables (managing HDF5 datasets)
  • Jupyter (research collaboration)
  • IPython (powerful shell)
  • HDFS (C/C++ wrapper for Hadoop)
  • pymongo (MongoDB driver)
  • SQLAlchemy (Python SQL Toolkit)
  • redis (Redis access libraries)
  • PyMySQL (MySQL connector)
  • scikit-learn (machine learning algorithms)
  • TensorFlow (deep learning with neural networks)
  • theano (deep learning with neural networks)
  • keras (high-level neural networks API)
  • bokeh (data visualization)
  • seaborn (data visualization)
  • dask (data engineering)
  • airflow (data engineering)
  • luigi (data engineering)
  • elasticsearch (data search engine)
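
As a rough illustration of how two of the packages above fit together, the sketch below uses NumPy for a vectorized array computation and pandas to label and summarize the same data. The sample readings and the column names ("oslo", "cairo") are hypothetical and chosen only for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three temperature readings for two cities.
readings = np.array([[20.5, 22.0, 19.5],
                     [30.0, 31.5, 29.0]])

# NumPy: vectorized reduction along an axis (mean of each row).
row_means = readings.mean(axis=1)

# pandas: attach column labels to the same data and summarize per column.
frame = pd.DataFrame(readings.T, columns=["oslo", "cairo"])
summary = frame.agg(["min", "max", "mean"])

print(row_means)
print(summary)
```

NumPy handles the raw numeric array, while pandas adds labels and tabular summaries on top of it.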

Cloud and Web Application Development

  • Django (web framework)
  • flask (web framework – microservices)
  • tornado (web framework and networking)
  • requests (HTTP client library)
  • AWS SDK (Amazon cloud)
  • google-cloud (Google Cloud)
  • simplejson (JSON library)
  • Twisted (asynchronous networking)
  • urllib3 (HTTP w/ connection pooling)
  • jinja2 (template engine)
  • s3transfer (AWS S3)
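
For the JSON entry above, a minimal sketch of a serialize/parse round trip follows. It assumes simplejson may not be installed and falls back to the standard-library json module, which exposes the same dumps/loads calls. The payload layout and the helper names encode_response and decode_response are hypothetical.

```python
# simplejson mirrors the standard-library json API, so fall back to the
# standard library when simplejson is not installed.
try:
    import simplejson as json
except ImportError:
    import json


def encode_response(status, items):
    """Serialize a hypothetical API payload to a JSON string with sorted keys."""
    payload = {"status": status, "count": len(items), "items": items}
    return json.dumps(payload, sort_keys=True)


def decode_response(text):
    """Parse a JSON string back into Python objects."""
    return json.loads(text)


body = encode_response("ok", ["a", "b"])
decoded = decode_response(body)
print(body)
```

Sorting the keys keeps the serialized output stable, which can make responses easier to compare in tests.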

Code Quality/Testing

  • pytest (testing)
  • nose (testing)
  • selenium (testing)
  • flake8 (code quality)
  • coverage (test coverage)

Developer Utilities

  • pytz (time zone library)
  • PyYAML (YAML support)
  • py (code generation, API control, INI file parsing)
  • lxml (processing XML/HTML)
  • cffi (C code interface)


Security

  • cryptography (recipes and primitives)
  • pyOpenSSL (Python interface to OpenSSL)
  • passlib (password hashing)
  • requests-oauthlib (OAuth support)
  • ecdsa (cryptographic signature)
  • PyCryptodome (PyCrypto replacement)
  • service_identity (service identity verification for pyOpenSSL; helps prevent man-in-the-middle attacks)

