Python and Java in the real world

Disclaimer

If you are a Java or Python evangelist, or you do not know the basics of both programming languages, this post might not be suitable for you. It presents a real-world test of both languages.

Introduction

A couple of years ago we asked Python and Java professionals to make a head-to-head comparison of the two languages in a real-world setting. The reason is that most comparisons look only at simple semantics and do not take into account language fundamentals, development speed and programming complexity. When I found the analysis lying around on my disk, I decided to wrap it up into an article. I hope it will help you with your decisions. I also asked fellow Python and Java enthusiasts for comments: what they would do, and what they think about the code that was written.

The goal

  1. The program has two input parameters: a source directory and a target directory (where parsed files are stored);
  2. find the files in the source directory;
  3. if a file is zipped, unzip it and remove the zip file;
  4. for each file, check whether the first 100000 characters are ASCII. At the end, if the total number of error characters divided by the file size is greater than 0.003, write an error message to the file;
  5. zip the files that were contained in the zip (or the original file) together with *.chars and store the result in the target directory;
  6. clean up all resources created.
  • Threads or processes may be used.
  • The program must measure its running time in milliseconds.
  • The parser reads 500 files.
  • The files come in different sizes.
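Step 4 is the core of the task, so here is a minimal Python 3 sketch of it. This is not the code that was measured. It assumes that "characters" means bytes, that a byte above 127 counts as an error, and that the error message goes into a `<name>.chars` file next to the original. The function names `count_non_ascii`, `check_file` and `write_report` are hypothetical.

```python
import os

SAMPLE_SIZE = 100000      # step 4: how many leading bytes to examine
ERROR_THRESHOLD = 0.003   # step 4: maximum tolerated error ratio


def count_non_ascii(data):
    """Count the bytes in `data` that fall outside the 7-bit ASCII range."""
    return sum(1 for byte in data if byte > 127)


def check_file(path):
    """Return (error_count, ratio, failed) for one file.

    Only the first SAMPLE_SIZE bytes are inspected, but the ratio is taken
    against the full file size, as the task description states.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        sample = handle.read(SAMPLE_SIZE)
    errors = count_non_ascii(sample)
    ratio = errors / size if size else 0.0   # empty files cannot fail
    return errors, ratio, ratio > ERROR_THRESHOLD


def write_report(path):
    """Write `<path>.chars` with an error message when the check fails.

    Returns the report path, or None when the file passed the check.
    The `.chars` naming is an assumption based on step 5.
    """
    errors, ratio, failed = check_file(path)
    if not failed:
        return None
    report_path = path + ".chars"
    with open(report_path, "w") as report:
        report.write("%d non-ASCII bytes, ratio %.6f\n" % (errors, ratio))
    return report_path
```

Because the ratio divides by the whole file size while only the first 100000 bytes are sampled, a very large file can never exceed the threshold: 100000 errors in a 100 MB file give a ratio of roughly 0.001.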

Coding result analysis

Language    Size (bytes)  Lines  Dependencies
Python 2.7  45927         1568   Twisted 12.0.0
Java 6      4976          188    none

Test setup

  • 500 Zip files, total 8.2GB
  • Run 100 times, sleep one minute before each execution
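The measurement loop described above can be sketched as follows in Python 3. This is an illustration under assumptions, not the harness that was used: `time_run_ms`, `benchmark` and `format_mm_ss` are hypothetical names, `job` stands for one full run of the parser, and the `sleep` parameter exists only so the pause can be replaced in a quick trial.

```python
import time


def time_run_ms(job):
    """Run `job()` once and return its wall-clock duration in milliseconds."""
    start = time.perf_counter()
    job()
    return (time.perf_counter() - start) * 1000.0


def benchmark(job, runs=100, pause_seconds=60.0, sleep=time.sleep):
    """Sleep, run, repeat; return fastest, slowest and average in ms."""
    durations = []
    for _ in range(runs):
        sleep(pause_seconds)          # pause before each execution
        durations.append(time_run_ms(job))
    return {
        "fastest": min(durations),
        "slowest": max(durations),
        "average": sum(durations) / len(durations),
    }


def format_mm_ss(milliseconds):
    """Format a duration as M:SS, which the result tables appear to use."""
    total_seconds = int(round(milliseconds / 1000.0))
    return "%d:%02d" % divmod(total_seconds, 60)
```

With the defaults, the pauses alone take 100 minutes, so a short trial with `runs=3, pause_seconds=0` is a sensible first check.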

7 threads

Having only 7 processes running is not realistic, but we wanted a comparison with low resource usage, hopefully with each process running on its own CPU core.

Language    Fastest  Slowest  Average  CPU   Memory
Python 2.7  2:37     3:05     2:43     388%  1111.18 MB (1.09 GB)
Java 6      3:08     3:22     3:14     363%  91.43 MB (0.089 GB)

100 threads

This is closer to the real world: 100 processes run in parallel to give the fastest result. Scaling this way may end up requiring a server farm, but when it comes to cost, perhaps it is still better to have only one server?

Language    Fastest  Slowest  Average  CPU   Memory
Python 2.7  3:51     5:67     5:48     180%  2239.94 MB (2.19 GB)
Java 6      3:14     3:45     3:43     149%  1480.40 MB (1.45 GB)

Verdict

Both programs were able to take advantage of a multicore processor: Java with threading and Python with operating-system processes. As a result, when there are not many threads, Python completed the job faster, because of faster disk I/O. The drawback was its huge memory usage.
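To illustrate what "operating-system processes" means on the Python side, here is a minimal Python 3 sketch using the standard library. The measured Python 2.7 program depended on Twisted 12.0.0, so `concurrent.futures` is a stand-in chosen for brevity, not the original approach. `run_parallel` and its parameters are hypothetical names.

```python
from concurrent.futures import ProcessPoolExecutor


def run_parallel(worker, paths, workers=7, executor_cls=ProcessPoolExecutor):
    """Apply `worker` to every path using a pool of `workers` workers.

    The default executor starts operating-system processes; pass
    ThreadPoolExecutor to compare against threads. With the process pool,
    `worker` must be a module-level function so that it can be pickled.
    """
    with executor_cls(max_workers=workers) as pool:
        # map() returns results in the same order as the input paths
        return list(pool.map(worker, paths))
```

On platforms that start worker processes by spawning a fresh interpreter, the call to `run_parallel` has to sit under an `if __name__ == "__main__":` guard. Each worker process also carries its own interpreter and its own buffers, so memory use grows with the worker count.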

With a high thread count, Java completed the job faster and with lower memory usage. The reason for Python's memory usage is the lack of file streaming. At the higher thread count, memory usage rose because of disk I/O, but somehow Python's speed also went down dramatically. So, the winners are:

  • I/O Speed: Python
  • Resource usage: Java
  • Parallel processing: Java
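The memory remark above comes down to reading whole files instead of streaming them. As a rough illustration of the difference, the Python 3 sketch below extracts one archive member in fixed-size chunks. It is not taken from the measured code; `extract_streaming` and `CHUNK_SIZE` are hypothetical names, and the 64 KB buffer is an arbitrary assumption.

```python
import shutil
import zipfile

CHUNK_SIZE = 64 * 1024  # assumed buffer size; not taken from the original code


def extract_streaming(zip_path, member, out_path, chunk_size=CHUNK_SIZE):
    """Copy one archive member to disk in fixed-size chunks.

    Unlike ZipFile.read(), which returns the whole member as one bytes
    object, this holds roughly `chunk_size` bytes of payload at a time.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as source, open(out_path, "wb") as target:
            shutil.copyfileobj(source, target, chunk_size)
    return out_path
```

With 100 workers each holding a whole decompressed file in memory, usage scales with file size times worker count. With chunked copying it scales with the buffer size instead.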

Programmers note

When it came to refactoring the code, we found that fixing bugs and issues in Java is slightly easier, as there are a lot of IDEs out there that help you find references in the code. As mentioned above, I asked fellow programmers to do a small refactoring in code they had never seen before, and doing it in Java was easier than in Python.

The task also took longer to solve in Python (this is also reflected in the size of the code).

Side note

This post is not meant to show which language is better and why; it is meant to test which language is more suitable for real-world solutions. As mentioned in Python's "Comparing Python to Other Languages", in the real world you use Python for prototyping or as a glue language. When it comes to the debate about code size or speed of development, we would argue there as well. We did choose Python, as it was better for a full solution, but looking back it was perhaps not the smartest choice. Within a year our server farm grew rapidly, and development slowed down over time as the code base grew. What we would do differently is write the core in Java, C# or C++, and small portions of event handlers in JavaScript, Python, Lua, PHP or whatever language is suitable.
