Update May 17, 2017: Python packages NumPy, SciPy, and matplotlib are now optimized with Intel's Math Kernel Library for better performance in the current release of ActivePython. Test out the latest enhancements--download ActivePython 2.7 or 3.5 Community Edition.
Python's visibility in finance has risen sharply, driven by the recent SEC proposal to require that most issuers of asset-backed securities file a Python computer program to model and document the flow of funds (or waterfall) provisions of the transaction. We thought it timely to ask what the "must-have" Python packages for finance would be, so we asked our financial Pythonistas to tell us in a short survey on the topic.
NumPy, SciPy and matplotlib head up the list and are obvious "must-haves" for finance. The Python binding for the ever-popular MySQL database remains a popular choice. As financial front-office types are loath to give up Excel, and insist on their financial datasets being collected in Excel spreadsheets, xlrd made the list as a necessity.
Here are the survey results in order of their top choice rankings:
- NumPy - the fundamental library needed for scientific and financial computing with Python as it contains a powerful N-dimensional array object, advanced array slicing methods, convenient array reshaping methods and libraries with numerical routines for basic linear algebra functions, basic Fourier transforms and sophisticated random number capabilities.
- SciPy - a suite of scientific tools for Python. It depends on the NumPy library, and it gathers a variety of high-level science and engineering modules together as a single package. SciPy provides modules for statistics, optimization, numerical integration, linear algebra, Fourier transforms, signal processing, image processing, genetic algorithms, ODE solvers and special functions.
- matplotlib - A numerical plotting library that provides production quality 2-D numerical plotting functionality in a variety of hardcopy formats and interactive environments across platforms.
- MySQL for Python - A Python binding for MySQL, allowing the user to integrate MySQL execution into any Python script.
- PyQt - a popular Python binding of the cross-platform GUI toolkit Qt.
- xlrd - Library for developers to extract data from Microsoft Excel spreadsheet files (also see xlwt).
- RPy2 - RPy2 is a simple Python interface for R, able to execute any R function from within a Python script.
- NetworkX - A package for creating, manipulating and analyzing graphs and complex networks.
- SymPy - A library for symbolic mathematics (algebraic evaluation, differentiation, expansion, complex numbers, etc.) that is contained in a pure Python distribution. Not to be confused with SimPy, the simulation package listed below.
- Boost.Python - This C++ library enables seamless interoperability between C++ and Python.
- PyMC - PyMC implements the Metropolis-Hastings algorithm as a Python class, providing flexibility when building your model. PyMC is also highly extensible, and well supported by the community.
- SimPy - Short for “Simulation in Python”, an object-oriented, process-based discrete-event simulation language, making it a wholesale agent-based modeling environment written entirely in Python.
- Pycluster - This package contains efficient implementations of hierarchical and k-means clustering.
- NLTK - Natural Language Toolkit is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language.
- An honorable mention should go to wxPython, which received a number of write-in votes. It is a set of Python extension modules that wrap the cross-platform GUI classes from wxWidgets.
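To give a feel for why NumPy and SciPy top the list, here is a minimal sketch of two routine finance calculations: annualized volatility from a price series, and the internal rate of return of a cash-flow stream. The prices, the cash flows, the 252-trading-day convention and the helper name `npv` are all assumptions made up for illustration; none of them come from the survey.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical daily closing prices (illustrative data only).
prices = np.array([100.0, 101.5, 100.8, 102.3, 103.1, 102.7])

# NumPy: vectorized log returns, then annualized volatility
# (assumes 252 trading days per year).
log_returns = np.diff(np.log(prices))
annual_vol = log_returns.std(ddof=1) * np.sqrt(252)

# Hypothetical cash flows: an outlay of 1000 followed by four annual receipts.
cash_flows = np.array([-1000.0, 300.0, 300.0, 300.0, 300.0])
periods = np.arange(cash_flows.size)


def npv(rate):
    """Net present value of cash_flows at a per-period discount rate."""
    return float(np.sum(cash_flows / (1.0 + rate) ** periods))


# SciPy: root-find the rate at which NPV is zero (the internal rate of return).
# The bracket [0, 1] is chosen because NPV changes sign across it for this data.
irr = brentq(npv, 0.0, 1.0)

print("annualized volatility: {:.4f}".format(annual_vol))
print("internal rate of return: {:.4f}".format(irr))
```

The array operations replace explicit loops over prices and periods, and `brentq` only needs a bracket over which the function changes sign, so a different cash-flow stream may need a different interval.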
If you are interested in using Python as a platform for financial analysis, modeling or other research, I should also note that most of these packages come standard with the Community Edition of ActivePython - free to use and supported by the community.
As well, NumPy, SciPy, & matplotlib are now available for Python 2.6 on linux-x86_64, macosx, linux-x86, and win32-x86 for ActivePython Business Edition, Enterprise Edition, and OEM customers. These distributions are recommended for use in business- or mission-critical applications.
The survey polls are still open; to cast your vote, click here.
Many thanks to Drew Conway, whose excellent post inspired this survey.
For more information on the SEC's proposal, visit http://www.sec.gov