Tom Radcliffe's Blogs

Tom Radcliffe has over 20 years of experience in software development, data science, machine learning, and management in both academia and industry. He is a professional engineer (PEO and APEGBC) and holds a PhD in physics from Queen's University at Kingston. Tom brings a passion for quantitative, data-driven processes to ActiveState. He is deeply committed to the ideas of Bayesian probability theory, and assigns a high Bayesian plausibility to the idea that putting the best software tools in the hands of the most creative and capable people will make the world a better place.

  • Python vs C++ for Text Processing
    Rendering large amounts of text fast via blprnt.blg

    Text will always be a major format for data, and it will never be well-organized. According to a popular joke built on Phil Karlton’s famous line, the two hard problems in computing are naming things, cache invalidation, and off-by-one errors. A third problem could be “formatting things”. Most textual data has formatting irregularities that make it a pain to process, and much of the work in text processing goes into dealing with them. These are just the sad facts.

  • Accelerating Your Algorithms: Considerations in Design, Algorithm Choice and Implementation

    The pursuit of speed is one of the few constants in computing, and it is driven by two things: ever-increasing amounts of data, and limited hardware resources. Today, in the era of Big Data and the Internet of Things, we are getting pinched in both directions. Fortunately, we are getting much better at distributed and parallel computing, but the need for raw speed at the algorithmic level is never going to go away. If we can make our algorithms inherently faster, we will get more out of our expensive hardware, and that is always going to be a good thing.

  • Robust Algorithms for Machine Learning

    Machine learning is often held out as a magical solution to hard problems that will absolve us mere humans from ever having to actually learn anything. But in reality, for data scientists and machine learning engineers, there are a lot of problems that are much more difficult to deal with than simple object recognition in images, or playing board games with finite rule sets.

  • Pandas: Framing the Data

    Data science, and numerical computing in general, has a problem: the deep linear algebra libraries deal with pure numbers in vectors and matrices, but in the real world there is always metadata attached to those structures that needs to be carried along through the computational pipeline. Rows and columns have information attached to them--names, typically--that has to be accounted for even as we do things like remove rows or swap data around to make certain computations more tractable.

  • Python vs. Ruby: Which is best for web development?

    Python and Ruby are among the most popular programming languages for developing websites, web-based apps, and web services.

    In many ways, the two languages have a lot in common. Visually they are quite similar, and both provide programmers with high-level, object-oriented coding, an interactive shell, standard libraries, and persistence support. However, Python and Ruby are worlds apart in their approach to solving problems: their syntax and philosophies differ greatly, primarily because of their respective histories.

  • Python and Tables for (Fairly) Big Data

    Big Data is big these days, as more and more companies dig into their servers to find out what makes their market tick.

    There is "big", and then there is "BIG", however. Really big data--multi-terabyte-scale--is still fairly rare. If you're working at that scale then Hadoop MapReduce or possibly Spark is required.

  • Go for Object-Oriented Developers

    Software design is about representation: how do we represent the solution to a problem in code that can be executed on the machine of our choice? How do we represent the problem domain to the user? The software design problem is not inherently different from the problem of expression in any language, formal or informal.

  • Lua: Not Your Average Scripting Language

    Lua is not your average scripting language. It is small, fast, portable, and embeddable. These cornerstones of the language make it well suited to many corners of the software industry, including embedded devices, video games, and even web applications. The language has been embraced by NASA, Adobe, NGINX, and Mozilla--to name a few.

  • Tipping Point in Open Source Security

    In the past couple of years we have seen some of the biggest security issues in open source, including Heartbleed, Shellshock, and POODLE. More recently, in March of this year, we faced the cross-protocol security bug DROWN. The impact of these security holes has been far-reaching--millions of users and sites were affected.

  • Functional Python

    Functional programming is a discipline, not a language feature. It is supported by a wide variety of languages, although those languages can make it more or less difficult to practice the discipline. Python has a number of features that support functional programming, including map/reduce functions, partial application, and decorators.
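
As a rough illustration of the three features named in the Functional Python teaser, here is a minimal sketch using only the standard library. The names `twice`, `scale`, `double`, `add_two`, and `sum_of_doubles` are hypothetical and chosen for this example; they are not taken from the post itself.

```python
import operator
from functools import partial, reduce, wraps


def twice(func):
    """Decorator: return a function that applies func two times in a row."""
    @wraps(func)
    def wrapper(x):
        return func(func(x))
    return wrapper


def scale(factor, x):
    """Pure function: multiply x by factor."""
    return factor * x


# Partial application: fix the first argument of scale to get a new function.
double = partial(scale, 2)


@twice
def add_two(x):
    # The body adds 1; the decorator applies it twice, so the result is x + 2.
    return x + 1


def sum_of_doubles(values):
    """Map double over values, then reduce the results with addition."""
    return reduce(operator.add, map(double, values), 0)
```

Nothing here mutates shared state: each piece takes values in and hands new values back, which is the discipline the teaser describes. Calling `sum_of_doubles([1, 2, 3])` doubles each element lazily through `map` and folds the results with `reduce`, starting from the explicit initial value `0` so that an empty input is also handled.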