Python is a relative newcomer to the data-handling scene, but it is rapidly gaining ground for several reasons. First, it is open source and free: the low barrier to entry encourages experimentation and contributions to its development. A wide range of mature, high-quality libraries is now available, so a programmer can build sophisticated applications relatively quickly by reusing the work of others. Among others, modules exist for database connectivity, statistical analysis, publication-quality graphics and web application frameworks.
Python is an interpreted language, which means that, all things being equal, it will not run as quickly as a compiled language. However, all things are rarely equal. Python excels in two roles. First, its straightforward syntax and range of available libraries make it particularly suited to rapidly building ETL systems: Python handles the flow control, directing the data operations of a database back end. Since the heavy lifting is done by the database, the speed penalty is negligible. Second, for data mining, research and statistical analysis, the wide range of libraries means it is usually possible to find an off-the-shelf algorithm with exactly the functionality you need, without having to code it from scratch. The saving in development time will often outweigh any loss in execution time. Where speed is critical, Python can manage flow control while data-intensive operations are handed off to faster languages (e.g. C++, Java). Many Python libraries do in fact call C or C++ routines, giving the Python programmer the best of both worlds.
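As a minimal sketch of the first role, the following uses Python's built-in sqlite3 module to direct an ETL step while the database engine performs the aggregation itself. The table and column names are illustrative assumptions, not from any particular client system.

```python
import sqlite3

# Python handles flow control; the database does the heavy lifting.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 45.5)],
)

# The GROUP BY aggregation runs inside the database engine;
# Python merely issues the statement and collects the summary.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('north', 165.5), ('south', 80.0)]
conn.close()
```

The same pattern scales up: swap sqlite3 for a production database driver and the Python layer stays almost unchanged.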
Python is, at its core, an imperative language, but it also supports functional and object-oriented programming, and libraries can be built to support the declarative paradigm. Its flexibility is its strength and accounts for its widespread and growing popularity.
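To illustrate that flexibility, here is one small task (summing the squares of the even numbers in a list) written in both the imperative and the functional style; the task itself is just a toy example chosen for brevity.

```python
nums = [1, 2, 3, 4, 5, 6]

# Imperative style: an explicit loop with an accumulator
total = 0
for n in nums:
    if n % 2 == 0:
        total += n * n

# Functional style: map/filter expressed as a generator comprehension
total_fp = sum(n * n for n in nums if n % 2 == 0)

print(total, total_fp)  # 56 56
```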
The NumPy library deserves particular mention. It supports vector- and array-based operations that have been carefully optimised for speed. Native array handling makes it particularly suited to expressing mathematically oriented problems. When properly deployed, the result is fast, succinct, expressive code. Further speed gains can often be achieved by using the Numba library alongside NumPy. Some example analyses with code...
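As a small illustration of the vectorised style, the sketch below simulates a random price series and computes a rolling mean with a single NumPy call rather than an explicit loop; the series length and window size are arbitrary assumptions for the example.

```python
import numpy as np

# Simulate 250 days of prices from random daily returns
rng = np.random.default_rng(42)
prices = 100 * np.cumprod(1 + rng.normal(0, 0.01, 250))

# Rolling 20-day mean computed in one vectorised convolution,
# with no explicit Python-level loop
window = 20
moving_avg = np.convolve(prices, np.ones(window) / window, mode="valid")

print(moving_avg.shape)  # (231,)
```

Because the loop runs inside NumPy's compiled routines, the same code stays fast as the array grows; decorating a hot function with Numba's `@njit` can push performance further still.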
We can help you with:
- simulation and modelling with NumPy, scikit-learn and SciPy
- building your ETL system using Python as a backbone
- converting algorithms from other languages into Python
- data mining and statistical analysis using Python
- integrating data from multiple systems
- integrating code in multiple languages using Python
- creating reports and process monitoring
- Python training
We can help you expand an existing system, port your system to Python, or build a new system from scratch.
Previous clients have been in the Banking, Retail, Marketing and Government sectors.
Contact us to discuss your project.