2010 Conference Schedule

Day 1

09:30-10:30 | Perry Greenfield | Keynote: How Python Slithered into Astronomy
10:30-10:45 | Tea Break
10:45-11:30 | Fernando Perez | Special Talk: IPython : Beyond the Simple Shell
11:30-11:50 | Farhat Habib | Python as a Platform for Scientific Computing Literacy for 10+2 Students: Weighing the Balance
11:50-12:10 | Jayesh Gandhi | Microcontroller experiment and its simulation using Python
12:10-12:40 | Vaidhy Mayilrangam | Natural Language Processing Using Python
12:40-13:10 | Georges Khaznadar | Live media for training in experimental sciences
14:10-14:20 | Shubham Chakraborty | Use of Python and Phoenix-M interface in Robotics
14:20-14:30 | Erroju Rama Krishna | Simplified and effective Network Simulation using ns-3
14:30-14:40 | More Lightning Talks
14:40-15:10 | Asokan Pichai | Invited Talk: Teaching Programming with Python
15:10-15:30 | Hemanth Chandran | Performance Evaluation of HYBRID MAC for 802.11ad: Next Generation Multi-Gbps Wi-Fi using SimPy
15:30-15:50 | Karthikeyan Selvaraj | PyCenter
15:50-16:10 | Tea Break
16:10-16:40 | Satrajit Ghosh | Invited Talk: Nipype: Opensource platform for unified and replicable interaction with existing neuroimaging tools
16:40-17:00 | Nek Sharan | Parallel Computation of Axisymmetric Jets
17:00-17:20 | Pankaj Pandey | PySPH: Smooth Particle Hydrodynamics with Python

Day 2

09:00-10:00 | John Hunter | Special Talk: matplotlib: Beyond the simple plot
10:00-10:45 | Prabhu Ramachandran | Invited Talk: Mayavi : Bringing Data to Life
11:00-11:45 | Stéfan van der Walt | Invited Talk: In Pursuit of a Pythonic PhD
11:45-12:15 | Dharhas Pothina | HyPy & HydroPic: Using python to analyze hydrographic survey data
12:15-12:35 | Prashant Agrawal | A Parallel 3D Flow Solver in Python Based on Vortex Methods
12:35-13:05 | Ajith Kumar | Python in Science Experiments using Phoenix
14:05-14:15 | Harikrishna | Python based Galaxy workflow integration on GARUDA Grid
14:15-14:25 | Arun C. H. | Automation of an Optical Spectrometer
14:25-14:35 | More Lightning Talks
14:35-14:55 | Krishnakant Mane | Convincing Universities to include Python
14:55-15:15 | Shantanu Choudhary | "Python" Swiss army knife for Prototyping, Research and Fun.
15:15-15:35 | Puneeth Chaganti | Pictures, Songs and Python
15:35-15:55 | Hrishikesh Deshpande | Wavelet based denoising of ECG using Python
16:10-16:40 | Jarrod Millman | Invited Talk: Building an open development community for neuroimaging analysis
16:40-17:00 | Ramakrishna Reddy Yekulla | Building and Packaging your Scientific Python Application For Linux Distributions
17:00-17:20 | Yogesh Karpate | Automatic Proteomic Finger Printing using Scipy
17:20-17:40 | Manjusha Joshi | SAGE for Scientific computing and Education enhancement

Invited Talks

How Python Slithered into Astronomy

Perry Greenfield

Talk/Paper Abstract

I will talk about how Python was used to solve our problems for the Hubble Space Telescope. From humble beginnings as a glue element for our legacy software, it has become a cornerstone of our scientific software for HST and the next large space telescope, the James Webb Space Telescope, as well as many other astronomy projects. The talk will also cover some of the history of essential elements for scientific Python and where future work is needed, and why Python is so well suited for scientific software.

IPython : Beyond the Simple Shell

Fernando Perez

Talk/Paper Abstract

IPython is a widely used system for interactive computing in Python that extends the capabilities of the Python shell with operating system access, powerful object introspection, customizable "magic" commands and many more features. It also contains a set of tools to control parallel computations via high-level interfaces that can be used either interactively or in long-running batch mode. In this talk I will outline some of the main features of IPython as it has been widely adopted by the scientific Python user base, and will then focus on recent developments. Using the high performance ZeroMQ networking library, we have recently restructured IPython to decouple the kernel executing user code from the control interface. This allows us to expose multiple clients with different capabilities, including a terminal-based one, a rich Qt client and a web-based one with full matplotlib support. In conjunction with the new HTML5 matplotlib backend, this architecture opens the door for a rich web-based environment for interactive, collaborative and parallel computing. There is much interesting development to be done on this front, and I hope to encourage participants at the sprints during the conference to join this effort.

Teaching Programming with Python

Asokan Pichai

Talk/Paper Abstract

As a trainer, I have often been engaged to teach fresh software engineers and software job aspirants. Before starting on the language- and platform-specific areas, I teach a part I refer to as Problem Solving and Programming Logic. I have used Python for this portion of the training for the last 12+ years. In this talk I wish to share my experiences and approaches. This talk is intended for Teachers, Trainers, Python Evangelists, and HR Managers [if they lose their way and miraculously find themselves in SciPy :-)]

matplotlib: Beyond the simple plot

John Hunter

Talk/Paper Abstract

matplotlib, a Python package for making sophisticated publication-quality 2D graphics, and some 3D, has long supported a wide variety of basic plotting types such as line graphs, bar charts, images, spectral plots, and more. In this talk, we will look at some of the new features and performance enhancements in matplotlib as well as some of the comparatively undiscovered features such as interacting with your data and graphics, and animating plot elements with the new animations API. We will explore the performance with large datasets utilizing the new path simplification algorithm, and discuss areas where performance improvements are still needed. Finally, we will demonstrate the new HTML5 backend, which in combination with the new HTML5 IPython front-end under development, will enable an interactive Python shell with interactive graphics in a web browser.

Mayavi : Bringing Data to Life

Prabhu Ramachandran

Talk/Paper Abstract

Mayavi is a powerful 3D plotting package implemented in Python. It includes both a standalone user interface and a powerful yet simple scripting interface. The key feature of Mayavi, though, is that it allows a Python user to rapidly visualize data in the form of NumPy arrays. Apart from these basic features, Mayavi has some advanced features. These include automatic script recording and embedding into a custom user dialog and application. Mayavi can also be run in an offscreen mode and be embedded in a Sage notebook. We will first rapidly demonstrate these key features of Mayavi. We will then discuss some of the underlying technologies like enthought.traits, traitsUI and TVTK that form the basis of Mayavi. The objective is to demonstrate the wide range of capabilities that both Mayavi and its underlying technologies provide the Python programmer.

Nipype: Opensource platform for unified and replicable interaction with existing neuroimaging tools

Satrajit Ghosh

Talk/Paper Abstract

Current neuroimaging software offers users an incredible opportunity to analyze their data in different ways, with different underlying assumptions. However, this has resulted in a heterogeneous collection of specialized applications without transparent interoperability or a uniform operating interface. Nipype, an open-source, community-developed initiative under the umbrella of Nipy, is a Python project that solves these issues by providing a uniform interface to existing neuroimaging software and by facilitating interaction between these packages within a single workflow. Nipype provides an environment that encourages interactive exploration of neuroimaging algorithms from different packages, eases the design of workflows within and between packages, and reduces the learning curve necessary to use different packages. Nipype is creating a collaborative platform for neuroimaging software development in a high-level language and addressing limitations of existing pipeline systems.

In Pursuit of a Pythonic PhD

Stéfan van der Walt

Talk/Paper Abstract

In May of 2005, I started a pilgrimage to transform myself into a doctor of engineering. Little did I know, then, that my journey would bring me in touch with some of the most creative, vibrant and inspiring minds in the open source world, and that an opportunity would arise to help realise their (and now my) dream: a completely free and open environment for performing cutting edge science. In this talk, I take you on my journey, and along the way introduce the NumPy and SciPy projects, our community, the early days of packaging, our documentation project, the publication of conference proceedings as well as workshops and sprints around the world. I may even tell you a bit about my PhD on super-resolution imaging!

Building an open development community for neuroimaging analysis

Jarrod Millman

Talk/Paper Abstract

Programming is becoming increasingly important to scientific activity. As its importance grows, the need for better software tools becomes more and more central to scientific practice. However, many fields of science rely on badly written, poorly documented, and insufficiently tested codebases. Moreover, scientific software packages often implement only the approaches and algorithms needed or promoted by the specific lab where the software was written.

In this talk, I will illustrate this situation by discussing some of the weaknesses of the software ecosystem for neuroimaging analysis circa 2004. I will then describe how several of my colleagues and I are attempting to rectify this situation with a project called Neuroimaging in Python. Specifically, I will discuss the approach we've taken (e.g., using Python) and the lessons we've learned.

Submitted Talks

Python as a Platform for Scientific Computing Literacy for 10+2 Students: Weighing the Balance

Farhat Habib

Talk/Paper Abstract

The use of Python as a language for introducing computing is becoming increasingly widespread. Here we report our findings from two years of running an introduction to computing course with Python as the programming language, and building upon it, using SciPy as a scientific computing language in a course on scientific computing.

The course is designed as a general computing course for introducing computing to first year undergraduate students of science. We find that a large majority of our incoming students have no prior exposure to programming and none of the students had any exposure to Python. Thus, the design of the course is such that it allows everybody to be brought up to speed with general programming concepts. The students will later specialize in varied topics from Biology to pure Mathematics, so the course emphasizes general computing concepts over specialized techniques. In a second course, on Scientific Computing, numerical methods are introduced with the aid of SciPy. The introduction to computing course has been taught twice, in Fall 2009 and Fall 2010, to batches of around 100 students each. In this paper we report our experience with teaching Python, along with student and faculty feedback related to the course.

USB Connectivity Using Python

Arun C. H.

Talk/Paper Abstract

Host software written in Python to communicate with a USB Mass Storage class device has been developed and tested. The usic18F4550.pyd module, which encapsulates all the functions needed to configure USB, has been developed. The Python extension (.pyd), built from C/C++ functions for Windows, makes use of SWIG, distutils and MinGW. The Simplified Wrapper and Interface Generator (SWIG) is an interface layer between Python and C/C++. It gives the flexibility to access lower-level C/C++ code through more convenient, higher-level languages such as Python and Java. The purpose of the Python interface is to allow the user to initialize and configure USB through a convenient scripting layer. The module is built around libusb, which can control a USB device with just a few lines of code. Libusb-win32 is a port of the USB library to the Windows operating system. The library allows user-space applications to access any USB device on Windows in a generic way without writing a single line of kernel driver code. A simple data acquisition system for measuring analog voltage, and for setting and reading the status of a particular pin of the microcontroller, has been fabricated. It is interfaced to the PC through a USB port and conforms to a libusb-win32 device. The USB DAQ hardware consists of a PIC18F4550 microcontroller and the essential components needed for USB configuration.

Automation of an Optical Spectrometer

Arun C. H.

Talk/Paper Abstract

This paper describes the automation of an optical spectrometer in order to precisely monitor angles, change the dispersing angle and hence measure the wavelength of light using a data logger, the necessary hardware and Python. Automating instruments through programs provides a great deal of power, flexibility and precision. Optical spectrometers are devices which analyze the wavelength of light, and are typically used to identify materials and study their optical properties. A broad spectrum of light is dispersed using a grating and the dispersed light is measured using a phototransistor. The signal is processed and acquired using a data logger. Data transfer and changes to the angle of diffraction are all done using Python. The angle of diffraction is varied by rotating the detector with a stepper motor to pick up lines. The stepper motor has 180 steps per revolution, or 2 degrees per step. A resolution of 0.1 degree is achieved in the spectrometer by using the proper gear ratio. The data logger is interfaced to the computer through a serial port. The stepper motor is interfaced to the computer through another serial port. Python is chosen here for its succinct notation and is implemented in a Linux environment.
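The step arithmetic quoted in the abstract can be sketched as below. This is only an illustration: the 20:1 gear ratio is inferred from the two quoted figures (2 degrees per motor step, 0.1 degree resolution) rather than stated by the author, the wavelength is computed from the standard grating equation d sin(theta) = m lambda, and the 600 lines/mm grating and all function names are hypothetical.

```python
import math

MOTOR_STEPS_PER_REV = 180    # from the abstract: 180 steps, i.e. 2 degrees per step
TARGET_RESOLUTION_DEG = 0.1  # from the abstract


def motor_step_angle(steps_per_rev=MOTOR_STEPS_PER_REV):
    """Angle turned by the motor shaft for one step, in degrees."""
    return 360.0 / steps_per_rev


def required_gear_ratio(step_deg, resolution_deg=TARGET_RESOLUTION_DEG):
    """Reduction ratio needed to turn one motor step into the target resolution."""
    return step_deg / resolution_deg


def steps_to_angle(steps, gear_ratio, step_deg):
    """Detector angle in degrees after a given number of motor steps."""
    return steps * step_deg / gear_ratio


def wavelength_nm(angle_deg, lines_per_mm=600, order=1):
    """Grating equation d*sin(theta) = m*lambda; the grating density is assumed."""
    spacing_nm = 1e6 / lines_per_mm
    return spacing_nm * math.sin(math.radians(angle_deg)) / order


step = motor_step_angle()
ratio = required_gear_ratio(step)
angle = steps_to_angle(200, ratio, step)
print(step, round(ratio, 6), round(angle, 6), round(wavelength_nm(angle), 1))
```

With these assumed values, 200 motor steps move the detector by 20 degrees, and the first-order wavelength follows from the grating spacing.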

"Python" Swiss army knife for Prototyping, Research and Fun.

Shantanu Choudhary

Talk/Paper Abstract

This talk covers the use of Python in different scenarios that helped me in my work.

Wavelet based denoising of ECG using Python

Hrishikesh Deshpande

Talk/Paper Abstract

The Python module "RemNoise" is presented. It allows the user to automatically denoise a one-dimensional signal using the wavelet transform. It also removes baseline wandering and motion artifacts. While RemNoise is developed primarily for biological signals like ECG, its design is generic enough that it should be useful to applications involving one-dimensional signals. The basic idea behind this work is to use the multi-resolution property of the wavelet transform, which allows non-stationary signals to be studied in greater depth. Any signal can be decomposed into detail and approximation coefficients, which can be further decomposed into higher levels, and this approach can be used to analyze the signal in the time-frequency domain. The very first step in any data-processing application is to pre-process the data to make it noise-free. Removing noise using the wavelet transform involves transforming the dataset into the wavelet domain, zeroing out the transform coefficients selected by a suitable thresholding method, and reconstructing the data by taking the inverse wavelet transform. This module makes use of the PyWavelets, NumPy and Matplotlib libraries in Python, and involves thresholding the wavelet coefficients of the data using one of several thresholding methods. It also allows multiplicative threshold rescaling to take into consideration the detail coefficients at each level of wavelet decomposition. The user can select the wavelet family and the level of decomposition as required. To evaluate the module, we experimented with several complex one-dimensional signals and compared the results with equivalent procedures in MATLAB. The results showed that RemNoise is an excellent module for preprocessing data for noise removal.
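The transform, threshold and reconstruct sequence described above can be sketched in a few lines. This is not RemNoise itself: it assumes a hand-written single-level Haar transform in place of PyWavelets, an even-length input, and soft thresholding with a user-supplied threshold. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One level of the Haar transform: returns (approximation, detail)."""
    pairs = range(0, len(signal), 2)  # assumes an even number of samples
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def soft_threshold(coeffs, threshold):
    """Shrink coefficients towards zero; those below the threshold become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Transform, threshold the detail coefficients, and reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


noisy = [1.1, 0.9, 2.0, 2.0]
print([round(x, 6) for x in denoise(noisy, threshold=1.0)])
```

A multi-level version would apply haar_forward again to the approximation coefficients, which is the decomposition into higher levels that the abstract mentions.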

HyPy & HydroPic: Using python to analyze hydrographic survey data

Dharhas Pothina

Talk/Paper Abstract

The Texas Water Development Board (TWDB) collects hydrographic survey data in lakes, rivers and estuaries. The data collected includes single, dual and tri-frequency echo sounder data collected in conjunction with survey grade GPS systems. This raw data is processed to develop accurate representations of bathymetry and sedimentation in the water bodies surveyed.

This talk provides an overview of how the Texas Water Development Board (TWDB) is using python to streamline and automate the process of converting raw hydrographic survey data to finished products that can then be used in other engineering applications such as hydrodynamic models, determining lake elevation-area-capacity relationships and sediment contour maps, etc.

The first part of this talk will present HyPy, a python module (i.e. function library) for hydrographic survey data analysis. This module contains functions to read in data from several brands of depth sounders, conduct anisotropic interpolations along river channels, apply tidal and elevation corrections, apply corrections to boat path due to loss of GPS signals as well as a variety of convenience functions for dealing with spatial data.

In the second part of the talk we present HydroPic, a simple Traits based application built on top of HyPy. HydroPic is designed to semi-automate the determination of sediment volume in a lake. Current techniques require the visual inspection of images of echo sounder returns along each individual profile. We show that this current methodology is slow and subject to high human variability. We present a new technique that uses computer vision edge detection algorithms available in python to semi-automate this process. HydroPic wraps these algorithms into an easy to use interface that allows efficient processing of data for an entire lake.

Parallel Computation of Axisymmetric Jets

Nek Sharan

Talk/Paper Abstract

The flow field for an imperfectly expanded jet has been simulated using Python for prediction of the jet screech frequency. This plays an important role in the design of advanced aircraft engine nozzles, since screech could cause sonic fatigue failure. For computation, the unsteady axisymmetric Navier-Stokes equations are solved using a fifth-order Weighted Essentially Non-Oscillatory (WENO) scheme with a subgrid scale Large-Eddy Simulation (LES) model. Smagorinsky’s eddy viscosity model is used for subgrid scale modeling, with second-order Total Variation Diminishing (TVD) Runge-Kutta time stepping. The performance of the Python code is enhanced by using different Cython constructs such as declaring the types of variables and NumPy arrays, and switching off bounds checking and wraparound. The speed-ups obtained from these methods have been individually clocked and compared with the Python code as well as an existing in-house C code. Profiling was used to highlight and eliminate the expensive sections of the code.
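For readers unfamiliar with the time integrator, a second-order TVD Runge-Kutta step can be sketched as follows. The scheme shown is the standard two-stage form (a forward Euler predictor followed by an averaging stage). The right-hand side used here is a toy linear decay, not the WENO/LES discretization of the paper, and the function names are hypothetical.

```python
def tvd_rk2_step(u, dt, rhs):
    """Advance the state list u by one second-order TVD Runge-Kutta step.

    Stage 1: u1 = u + dt * L(u)
    Stage 2: u_new = 0.5 * u + 0.5 * (u1 + dt * L(u1))
    """
    u1 = [ui + dt * li for ui, li in zip(u, rhs(u))]
    l1 = rhs(u1)
    return [0.5 * ui + 0.5 * (u1i + dt * l1i) for ui, u1i, l1i in zip(u, u1, l1)]


def decay(u):
    """Toy right-hand side for du/dt = -u, standing in for the spatial operator."""
    return [-x for x in u]


state = [1.0]
for _ in range(10):
    state = tvd_rk2_step(state, 0.1, decay)
print(state)
```

In the solver described above, rhs would instead evaluate the WENO-reconstructed fluxes and the subgrid scale terms on the grid.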

Further, both shared and distributed memory architectures have been employed for parallelization. Shared memory parallel processing is implemented through a thread based model with manual release of the Global Interpreter Lock (GIL). The GIL ensures that the running thread has safe and exclusive access to the Python interpreter internals. Hence, while one thread is running with the GIL, the other threads are put on hold until the running thread ends or is forced to wait. Therefore, to run two threads simultaneously, the GIL was manually released using the "with nogil" statement. The relative independence of the radial and axial spatial derivative computations provides the option of putting them in parallel threads. Distributed memory parallel processing, on the other hand, is done through MPI based domain decomposition, where the domain is split radially with an interface of three grid points. Each sub-domain is delegated to a different processor, and communication, in the form of message transmission, ensures that the interface grid points are updated. Performance analyses with an increasing number of processors indicate a trade-off between computation and communication. A combined thread and MPI based model is attempted to harness the benefits of both architectures.
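The radial splitting can be illustrated without MPI. The sketch below only computes which grid indices each rank would own and which neighbouring points it would need. It assumes that the "interface of three grid points" means a three-point overlap on each interior side of a sub-domain, which is an interpretation of the abstract, and it omits the message passing itself. The function name is hypothetical.

```python
GHOST = 3  # interface width in grid points, as quoted in the abstract


def split_radially(n_points, n_procs, ghost=GHOST):
    """Return one (own_start, own_stop, halo_start, halo_stop) tuple per rank.

    own_* is the half-open index range a rank updates itself; halo_* widens
    it by the interface points that must be received from neighbouring ranks.
    """
    base, extra = divmod(n_points, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)  # spread the remainder evenly
        stop = start + size
        halo_start = max(start - ghost, 0)         # no halo beyond the axis
        halo_stop = min(stop + ghost, n_points)    # no halo beyond the outer edge
        ranges.append((start, stop, halo_start, halo_stop))
        start = stop
    return ranges


for rank, r in enumerate(split_radially(10, 3)):
    print(rank, r)
```

The trade-off mentioned above shows up here directly: as the number of ranks grows, the owned range shrinks while the halo width stays fixed, so communication takes a larger share of each step.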

Simplified and effective Network Simulation using ns-3

Erroju Rama Krishna

Talk/Paper Abstract

Network simulation has great significance in research on modern networks. ns-2 is a popular simulation tool that has demonstrated this. Its successor, ns-3, keeps the efficiency of the existing mechanisms while adding a new design and the power of Python scripting. Python scripting can be added to legacy projects just as well as new ones, so developers don't have to abandon their old C/C++ code libraries. In ns-2 it is not possible to run a simulation purely from C++ (i.e., as a main() program without any OTcl), whereas ns-3 has new capabilities (such as handling multiple interfaces on nodes correctly, use of IP addressing and more alignment with Internet protocols and designs, more detailed 802.11 models, etc.).

In ns-3, the simulator is written entirely in C++, with optional Python bindings. Simulation scripts can therefore be written in C++ or in Python. The results of some simulations can be visualized by nam, but new animators are under development. Since ns-3 generates pcap packet trace files, other utilities can be used to analyze traces as well.

In this paper, the efficiency and effectiveness of the IP addressing simulation model of ns-3 are compared with those of the ns-2 simulation model. The ns-3 model consists of scripts written in Python, which makes the modeling simpler and more effective.


PyCenter

Karthikeyan Selvaraj

Talk/Paper Abstract

The primary objective is to define a centralized testing environment and a testing framework model which integrates all projects under test into a single unit.

The implementation uses concurrent processing and a client-server architecture with partitioned server zones for environment manipulation. This allows the server to run test requests from different projects with different environments and testing needs. The implementation provides auto-test generation, scheduled job runs from the server, and thin and thick clients.

The core engine facilitates the management of tests from all the clients with priority and remote scheduling. It has an extended configuration utility to manipulate test parameters and watch dynamic changes. By its implementation, it acts not only as a request preprocessor but also as a sophisticated test bed. It is provided with a storage and manipulation segment for every registered project in the server zone. The system schedules and records events and user activities, so that the results can be drilled down and examined to the core code level, along with the activities and system states at the test event point.

The system generates test cases in both human-readable and executable formats. The generated tests are based on pre-defined logic in the system, which can be extended to adopt new cases based on user requests. These are facilitated by a template system which has a predefined set of cases for various test types like compatibility, load, performance, code coverage, dependency and compliance testing. It is also extended with capabilities like centralized directory systems for user management with roles and privileges for authentication and authorization, global mailer utilities, a result consolidator and a visualizer.

With the effective implementation of the system and its minimal requirements, the entire testing procedure can be automated, with the testers being effectively used for configuring, ideating and managing the test system and scenarios. The overhead of managing test procedures such as environment pre-processing, test execution, and results collection and presentation is completely removed from the testing life cycle.

Live media for training in experimental sciences

Georges Khaznadar

Talk/Paper Abstract

A system for distance learning in the field of Physics and Electricity has been used for three years with some success for 15-year-old students. The students are given a little case containing a PHOENIX box featuring electric analog and digital I/O interfaces, some inexpensive discrete components and a live (bootable) USB stick.

The PHOENIX project was started by the Inter University Accelerator Centre in New Delhi, with the objective of improving the laboratory facilities at Indian universities, and it is growing with the support of the user community. PHOENIX depends heavily on the Python language, which is used for data acquisition, analysis and writing simulation programs to teach science and computation.

The hardware design of PHOENIX box is freely available.

The live bootable stick provides a free/libre operating system and a few dozen educational applications, including applications developed with SciPy to drive the PHOENIX box and manage the acquired measurements. The user interface has been made as intuitive as possible: the main window shows a photo of the front face of the PHOENIX acquisition device, its connections behaving like widgets to express their states, and a subwindow displays in real time the signals connected to it. A booklet gives general-purpose hints for the usage of the acquisition device. The educational interaction is done with a free learning management system.

The talk will show how such live media can be used as powerful training systems, allowing students to access at home exactly the same environment they can find in the school, and providing them with a lot of structured examples.

This talk addresses people who are involved in education and training in scientific fields. It describes one method which allows distance learning (however requiring a few initial lessons to be given non-remotely), and enables students to become fluent with Python and its scientific extensions, while learning physics and electricity. This method uses Internet connections to allow remote interactions, but does not rely on a wide bandwidth, as the complete learning environment is provided by the live medium, which is shared by teacher and students after their beginning lessons.

Use of Python and Phoenix-M interface in Robotics

Shubham Chakraborty

Talk/Paper Abstract

In this paper I will show how to use Python programming with a computer interface such as Phoenix-M to drive simple robots. In my quest towards Artificial Intelligence (AI) I am experimenting with a lot of different possibilities in Robotics. This one tries to mimic the working of a simple insect's autonomous nervous system using hard wiring and some minimal software usage. This is the precursor to my advanced robotics and AI integration, where I plan to use a new paradigm of AI based on Machine Learning and Self Consciousness via a Knowledge Feedback and Update process.

Python in Science Experiments using Phoenix

Ajith Kumar

Talk/Paper Abstract

Phoenix is a hardware plus software framework for developing computer interfaced science experiments. Sensor and control elements connected to Phoenix can be accessed using Python. Text based and GUI programs are available for several experiments. The Python programming language is used as a tool for data acquisition, analysis and visualization.

The objective of the project is to improve the laboratory facilities at universities and also to utilize computers in a better manner to teach science. The hardware design is freely available. The project is based on Free Software tools and the code is distributed under the GNU General Public License.

Building and Packaging your Scientific Python Application For Linux Distributions

Ramakrishna Reddy Yekulla

Talk/Paper Abstract

If you are an independent researcher, an academic project or an enterprise software company building large scale scientific Python applications, there is a huge community of packagers who look at upstream Python projects to get those packages into Linux distributions. This talk focuses on practices that make your applications easy to package so that they can be bundled with Linux distributions. Additionally, this talk will be more hands-on, more like a workshop. The audience is encouraged to bring as many Python applications as possible and, using the techniques shown in the talk, to get help packaging them for Fedora.

Microcontroller experiment and its simulation using Python

Jayesh Gandhi

Talk/Paper Abstract

Industrial electronics has been passing through a revolution due to the extensive use of microcontrollers. These electronic devices have a high capability to handle multiple events. Their capability to communicate with computers has made the revolution possible. It is therefore very important to have personnel trained in microcontrollers. In the present work, experiments for the study of microcontrollers and their peripherals, with simulation using Python, are carried out. This enables teachers to demonstrate the experiments in classroom sessions using simulations. The same experiments can then be carried out in the labs, using the same simulation setup together with the microcontroller hardware, to visualize and understand the experiments. Python is selected for its versatility and also to promote the use of open source software in education.

Here we demonstrate the experiment of driving seven-segment displays with a microcontroller. Four seven-segment displays are interfaced with the microcontroller through a single BCD to seven-segment display decoder/driver (74LS47) and switching transistors. The microcontroller switches on the first transistor, connected to the first display, and puts the number to be displayed on the 74LS47. It then pauses for a while, switches off the first display, puts the number for the second display on the 74LS47 and switches that display on. A similar action is carried out for all the displays and the cycle is repeated again and again. We can control the microcontroller's action through the serial port of the computer using Python. By simulating the seven-segment display using the VPython module and communicating the same action to the microcontroller, we can demonstrate the switching action of the display at a very slow rate. It is possible to actually see each display glowing individually, one after another. We can then gradually increase the switching rate, so that each display glows for only a few milliseconds. Finally, when the refresh rate is raised to more than about 25 times a second, all the displays appear to glow simultaneously.
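The multiplexing cycle described above can be sketched in plain Python. This is an illustration only: it models the order in which the displays are selected and the BCD code placed on the 74LS47 inputs, and leaves out the serial link and the VPython rendering. The function names and the mapping of display index to transistor are assumptions.

```python
def bcd_bits(digit):
    """4-bit BCD code (D, C, B, A) for one decimal digit, most significant bit first."""
    if not 0 <= digit <= 9:
        raise ValueError("BCD digit must be in the range 0-9")
    return tuple((digit >> shift) & 1 for shift in (3, 2, 1, 0))


def multiplex_cycle(number, n_displays=4):
    """Yield (display_index, bcd_code) for one pass over all displays.

    display_index stands for the transistor that is switched on while
    bcd_code is held on the decoder inputs.
    """
    text = str(number).zfill(n_displays)[-n_displays:]
    for index, ch in enumerate(text):
        yield index, bcd_bits(int(ch))


def dwell_time(refresh_hz, n_displays=4):
    """Seconds each display stays lit when the full cycle repeats refresh_hz times a second."""
    return 1.0 / (refresh_hz * n_displays)


for index, code in multiplex_cycle(1234):
    print(index, code)
print(dwell_time(25))
```

At the 25 cycles per second quoted above, each of the four displays is lit for 10 milliseconds per cycle, which matches the observation that the digits then appear to glow together.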

Hence it is possible to simulate and demonstrate experiments and to understand the capabilities of the microcontroller with great ease and at a very low cost.

SAGE for Scientific computing and Education enhancement

Manjusha Joshi

Talk/Paper Abstract

Sage is free, open source software for mathematics.

Sage can handle long integer computations, symbolic computing, matrices, etc. In education, Sage is used for cryptography, number theory and graph theory. The notebook feature in Sage allows the user to record all work in a worksheet for future use. These worksheets can be published for information sharing, so that students and trainers can exchange knowledge, share and experiment through worksheets.

Sage is an advanced computing tool which can enhance education in India.

Automatic Proteomic Finger Printing using Scipy

Yogesh Karpate

Talk/Paper Abstract

The idea is to demonstrate PyProt (Python Proteomics), an approach to classifying mass spectrometry data and making efficient use of statistical methods to look for potential prevalent disease markers and proteomic pattern diagnostics. Serum proteomic pattern diagnostics can be used to differentiate samples from patients with and without disease. Profile patterns are generated using surface-enhanced laser desorption and ionization (SELDI) protein mass spectrometry. This technology has the potential to improve clinical diagnostic tests for cancer pathologies. Two datasets are used in this study, both taken from the FDA-NCI Clinical Proteomics Program Databank. The first is of ovarian cancer and the second is of premalignant pancreatic cancer. PyProt uses the high-resolution ovarian cancer dataset that was generated using the WCX2 protein array. The ovarian cancer dataset includes 95 controls and 121 ovarian cancer sets, whereas the pancreatic cancer dataset has 101 controls and 80 pancreatic cancer sets. Two modules are designed and implemented in Python using NumPy, SciPy and Matplotlib. Two different kinds of classification are implemented. The first classifies the ovarian cancer dataset. The second focuses on a randomly commingled study set of murine sera and explores the ability of the low molecular weight information archive to classify and discriminate premalignant pancreatic cancer from the control animals.

A crucial issue for classification is feature selection, which selects the relevant features in order to focus the learning search. A relaxed setting for feature selection is known as feature ranking, which ranks the features with respect to their relevance. PyProt comprises two modules. The first implements feature ranking in Python using the Fisher ratio and the t-square statistical test to avoid a large feature space. In the second module, a multilayer perceptron (MLP) feed-forward neural network model with a static back-propagation algorithm is used for classification. The results match the databank results, and we conclude that PyProt is a useful tool for proteomic fingerprinting.
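As an illustration of feature ranking with the Fisher ratio, here is a minimal NumPy sketch. It is not PyProt's code: the function names, the synthetic data and the two-class form of the ratio, (mean_a - mean_b)^2 / (var_a + var_b), are assumptions made for this example.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher ratio for a two-class problem.

    X: (n_samples, n_features) array, e.g. intensities per m/z bin.
    y: (n_samples,) array of 0/1 class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0)
    # A small epsilon guards against division by zero for constant features.
    return num / (den + 1e-12)

def rank_features(X, y, k):
    """Indices of the k features with the largest Fisher ratio."""
    scores = fisher_ratio(X, y)
    return np.argsort(scores)[::-1][:k]

# Synthetic stand-in for spectra: only feature 2 separates the classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = np.array([0] * 20 + [1] * 20)
X[y == 1, 2] += 5.0
top = rank_features(X, y, 2)
```

Keeping only the top-ranked features shrinks the input that a downstream classifier has to learn from.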

Natural Language Processing Using Python

Vaidhy Mayilrangam

Talk/Paper Abstract

The purpose of this talk is to give a high-level overview of various text mining techniques, the statistical approaches and the interesting problems.

The talk will start with a short summary of two key areas, namely information retrieval (IR) and information extraction (IE). We will then discuss how to use the knowledge gained for summarization and translation. We will talk about how to measure the correctness of results. As part of measuring correctness, we will discuss different kinds of statistical approaches for classifying and clustering data.
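One common way to measure the correctness of retrieval results is precision, recall and their harmonic mean (F1). The abstract does not say which measures the talk covers, so the sketch below is only an example, with a hypothetical function name and made-up document identifiers.

```python
def precision_recall_f1(predicted, relevant):
    """Compare a set of returned items against the set of truly relevant ones."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Four documents returned, three truly relevant, two in common.
p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
```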

We will do a short dive into NLP specific problems - identifying sentence boundaries, parts of speech, noun and verb phrases and named entities. We will also have a sample session on how to use Python’s NLTK to accomplish these tasks.
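To show why sentence boundary detection is a real problem, here is a deliberately naive splitter built on the standard `re` module. It is not NLTK's tokenizer. The rule it uses (split after `.`, `!` or `?` when followed by whitespace and a capital letter) is an assumption chosen for simplicity, and it breaks on abbreviations.

```python
import re

# Whitespace that follows end punctuation and precedes a capital letter.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")

def split_sentences(text):
    """Split text into sentences with a single regular expression."""
    return [s for s in _BOUNDARY.split(text.strip()) if s]

sentences = split_sentences(
    "Python is fun. NLTK helps with text! Does it scale? Yes."
)

# The naive rule wrongly splits after the abbreviation "Dr.".
broken = split_sentences("Dr. Smith arrived.")
```

Trained tokenizers, such as those shipped with NLTK, exist precisely to handle cases like the second one.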

A Parallel 3D Flow Solver in Python Based on Vortex Methods

Prashant Agrawal

Talk/Paper Abstract

A 3D flow solver for incompressible flow around arbitrary 3D bodies is developed. The solver is based on vortex methods whose grid-free nature makes it very general. It uses vortex particles to represent the flow-field. Vortex particles (or blobs) are released from the boundary, and these advect, stretch and diffuse according to the Navier-Stokes equations.

The solver is based on a generic and extensible design. This has been made possible mainly by following a universal theme of using blobs in every component of the solver. Advection of the particles is implemented using a parallel fast multipole method. Diffusion is simulated using the Vorticity Redistribution Technique (VRT). To control the number of blobs, merging of nearby blobs is also performed.

Each component of the solver is parallelized. The boundary, advection and stretching algorithms are based on the same parallel velocity algorithm. Domain decomposition for the parallel velocity calculator is performed using space-filling curves. Diffusion, which requires knowledge of each particle's neighbours, uses a parallelized fast neighbour finder based on a bin data structure. The same neighbour finder is also used in merging.
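A serial sketch of a bin-based neighbour finder is given below. It is not the solver's implementation: the function names are hypothetical, and it assumes the search radius does not exceed the bin size, so that only the bin holding a particle and its adjacent bins need to be searched.

```python
from collections import defaultdict
from itertools import product
import math

def cell_of(point, cell):
    """Integer bin coordinates of a point."""
    return tuple(math.floor(c / cell) for c in point)

def build_bins(points, cell):
    """Map each occupied bin to the indices of the points inside it."""
    bins = defaultdict(list)
    for i, p in enumerate(points):
        bins[cell_of(p, cell)].append(i)
    return bins

def neighbours(points, bins, cell, i, radius):
    """Indices of points within `radius` of point i (assumes radius <= cell)."""
    p = points[i]
    key = cell_of(p, cell)
    found = []
    # Visit the bin holding p and every adjacent bin.
    for offset in product((-1, 0, 1), repeat=len(p)):
        other = tuple(k + o for k, o in zip(key, offset))
        for j in bins.get(other, ()):
            if j != i and math.dist(p, points[j]) <= radius:
                found.append(j)
    return sorted(found)

points = [(0.1, 0.1, 0.1), (0.4, 0.1, 0.1), (0.9, 0.9, 0.9), (0.8, 0.1, 0.1)]
cell = 0.5
radius = 0.5
bins = build_bins(points, cell)
near_1 = neighbours(points, bins, cell, 1, radius)
```

Because each query touches only nearby bins, the cost per particle depends on the local particle density instead of the total particle count.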

The code is written completely in Python. It is well-documented and well-tested. The code base is around 4500 lines long. The design follows an object oriented approach which makes it extensible enough to add new features and alternate algorithms to perform specific tasks.

The solver is also designed to run in a parallel environment involving multiple processors. This parallel implementation is written using mpi4py, which provides Python bindings for MPI.

Rigorous testing is performed using Python's unittest module. Some standard example cases are also solved using the present solver.

In this talk we will outline the overall design of the solver and the algorithms used. We discuss the benefits of Python and also some of the current limitations with respect to parallel testing.

Performance Evaluation of HYBRID MAC for 802.11ad: Next Generation Multi-Gbps Wi-Fi using SimPy

Hemanth Chandran

Talk/Paper Abstract

Next-generation Wireless Local Area Networks (WLANs) target multi-gigabit-per-second throughput by utilizing the unlicensed spectrum available at 60 GHz, in the millimeter-wave (mmwave) band. Towards this goal, a new standard, 802.11ad, is under consideration. Because of the limited range and other typical characteristics of these mmwave radios, such as high path loss, the requirements on the Medium Access Control (MAC) are quite different.

The conventional MAC protocols tend to achieve different objectives under different conditions. For example, the Carrier Sense Multiple Access / Collision Avoidance (CSMA/CA) technique is robust and simple and works well in overlapping network scenarios. It is also suitable for bursty traffic. On the other hand, CSMA/CA is not suitable for power management since it needs the stations to be awake at all times. Moreover, it requires an omnidirectional antenna pattern at the receiver, which is not practically feasible in the 60 GHz band.

A Time Division Multiple Access (TDMA) based MAC is efficient for Quality of Service (QoS) sensitive traffic. It is also useful for power saving, since stations know their schedule and can therefore power down in non-scheduled periods.

For 60 GHz usage, especially applications like wireless display, sync and go, and large file transfer, TDMA appears to be a suitable choice. For applications that require low-latency channel access (e.g. Internet access), however, TDMA appears to be inefficient due to the latency involved in bandwidth reservation.

Another choice is the polling MAC, which is highly efficient for directional communication in the 60 GHz band. It provides improved data rates with directional communication and also acts as an interference mitigation scheme. On the other hand, polling may not be efficient for power saving, nor can it take advantage of statistical traffic multiplexing. It also wastes power by polling stations that have no traffic to transmit.

Having the above facts in mind and considering the variety of applications involved in the next generation WLAN systems operating at 60 GHz, it can be concluded that no individual MAC scheme can support the traffic requirements.

In this paper we use SimPy to build a Discrete Event Simulation (DES) model of a proposed hybrid MAC protocol that dynamically adjusts the channel times between contention-based and reservation-based MAC schemes according to the traffic demand in the network.
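The idea of adjusting channel time to traffic demand can be illustrated with a small plain-Python function. This is not the proposed protocol, nor anything specified by 802.11ad, and it does not use SimPy. The proportional rule, the function name, the superframe length and the minimum share are all assumptions made for this sketch.

```python
def split_superframe(contention_load, reservation_load,
                     superframe_us=10000, min_share=0.1):
    """Divide one superframe between a contention period and a reservation period.

    The loads are the offered traffic (in any common unit) observed for each
    access scheme. Each period keeps at least `min_share` of the superframe
    so that neither scheme is starved.
    """
    total = contention_load + reservation_load
    share = 0.5 if total == 0 else contention_load / total
    share = min(max(share, min_share), 1.0 - min_share)
    contention_us = round(superframe_us * share)
    return contention_us, superframe_us - contention_us

mixed = split_superframe(30, 70)      # mostly reserved traffic
idle = split_superframe(0, 0)         # no demand: split evenly
bursty = split_superframe(100, 0)     # all contention traffic, clamped
```

In a discrete event simulation, a rule of this kind would be re-evaluated at the start of every superframe using the demand measured in the previous one.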

We plan to model the problem of admission control and scheduling as a discrete event simulation in SimPy. SimPy v2.1.0 is being used to simulate the proposed hybrid MAC. We are new to using Python for scientific purposes and have just begun using this powerful tool to get meaningful and useful results. We plan to share our learning experience and show how SimPy is increasingly becoming a useful tool (apart from regular modeling tools like Opnet / NS2).

PySPH: Smooth Particle Hydrodynamics with Python

pankaj pandey

Talk/Paper Abstract

We present a python/cython implementation of an SPH framework called PySPH. SPH (Smooth Particle Hydrodynamics) is a numerical technique for the solution of the continuum equations of fluid and solid mechanics.
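The core SPH operation, estimating a field at each particle as a kernel-weighted sum over its neighbours, can be sketched for density in one dimension. This is not PySPH's API: the function names, the cubic spline kernel and the direct O(N^2) sum are assumptions chosen to keep the example short.

```python
import numpy as np

def cubic_spline_1d(r, h):
    """Cubic spline kernel in one dimension, with support radius 2h."""
    q = np.abs(r) / h
    w = np.where(
        q <= 1.0,
        1.0 - 1.5 * q**2 + 0.75 * q**3,
        np.where(q <= 2.0, 0.25 * (2.0 - q) ** 3, 0.0),
    )
    # 2 / (3h) normalizes the kernel so that it integrates to one in 1D.
    return (2.0 / (3.0 * h)) * w

def summation_density(x, m, h):
    """rho_i = sum_j m_j W(x_i - x_j, h), evaluated with a direct sum."""
    diff = x[:, None] - x[None, :]
    return (m[None, :] * cubic_spline_1d(diff, h)).sum(axis=1)

# Evenly spaced particles representing a fluid of density 1000.
spacing = 0.01
x = np.arange(0.0, 1.0, spacing)
m = np.full_like(x, 1000.0 * spacing)
rho = summation_density(x, m, h=spacing)
```

Interior particles recover the reference density, while particles near the ends of the domain see fewer neighbours and so report a lower value. A production code would replace the all-pairs sum with a nearest-neighbour search.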

PySPH was written to be a tool that requires only a basic working knowledge of Python. Although PySPH may be run on distributed memory machines, no working knowledge of parallelism is required of the user: the same code may be run either in serial or in parallel, simply by invoking it appropriately through the mpirun command.

In PySPH, we follow the message passing paradigm, using the mpi4py Python binding. The performance-critical aspects of the SPH algorithm are optimized with Cython, which provides the look and feel of Python with performance close to that of a C/C++ implementation.

PySPH is divided into three main modules. The base module provides the data structures for the particles and the algorithms for nearest neighbor retrieval. The sph module builds on this to describe the interactions between particles and defines classes to manage these interactions. These two modules provide the basic functionality dictated by the SPH algorithm; of the two, a developer would most likely work with the sph module to enhance the functionality of PySPH. The solver module typically manages the simulation being run. Most of the functions and classes in this module are written in pure Python, which makes it relatively easy to write new solvers based on the provided functionality.

We use PySPH to solve the shock tube problem in gas dynamics and the classical dam break problem for incompressible fluids. We also demonstrate how to extend PySPH to solve a problem in solid mechanics which requires additions to the sph module.

Pictures, Songs and Python

Puneeth Chaganti

Talk/Paper Abstract

The aim of this talk is to get students, especially undergraduates, excited about Python. Most of what will be shown is out there on the open web. We just wish to draw the attention of the students and get them excited about Python, possibly image processing, and maybe even cognition. We hope that this talk will help retain more participants for the tutorials and sprint sessions.

The talk will have two parts. It will not consist of any deep research or amazing code. It's a mash-up of some weekend hacks, if they could be called so. We reiterate that the idea is not to show the algorithms or the code and ideas. It is to show the power that Python gives.

The first part of the talk will deal with the colour blue. We'll show some code to illustrate how our eyes suck at blue (1), if they really do. But, ironically, a statistical analysis that we did on Rolling Stone magazine's "Top 500 Songs of All Time" (2) revealed that the occurrences of blue are more than twice the number of occurrences of red and green! We'll show the code used to fetch the lyrics and count the occurrences.
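The counting step can be sketched with the standard library alone. This is not the speakers' script: fetching the lyrics is left out, the function name is hypothetical, and the sample text is made up for the example.

```python
import re
from collections import Counter

def colour_counts(lyrics, colours=("red", "green", "blue")):
    """Count whole-word, case-insensitive occurrences of each colour."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    counts = Counter(w for w in words if w in colours)
    return {c: counts[c] for c in colours}

# Made-up text standing in for fetched lyrics.
sample = (
    "Blue moon, blue sky. The red rose and the green, green grass; "
    "the blues don't count."
)
counts = colour_counts(sample)
```

Matching whole words means that "blues" is not counted as "blue", which is one of several choices that can change the totals.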

The second part of the talk will show some simple hacks with images. First, a simple script that converts images into ASCII art. We hacked up a very rudimentary algo to convert images to ASCII and it works well for "machine generated images." Next, a sample program that uses OpenCV (3) that can detect faces. We wish to show OpenCV since it has some really powerful stuff for image processing.
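A rudimentary image-to-ASCII conversion can be as simple as mapping each pixel's brightness to a character. The sketch below is not the speakers' script: it assumes the image is already available as rows of grayscale values from 0 to 255, and the character ramp is an arbitrary choice.

```python
# Characters ordered from darkest to lightest.
RAMP = "@%#*+=-:. "

def to_ascii(gray_rows, ramp=RAMP):
    """Map each grayscale value (0 to 255) to a character from the ramp."""
    last = len(ramp) - 1
    lines = []
    for row in gray_rows:
        lines.append("".join(ramp[value * last // 255] for value in row))
    return "\n".join(lines)

# A tiny two-row gradient standing in for a real image.
image = [
    [0, 64, 128, 255],
    [255, 128, 64, 0],
]
art = to_ascii(image)
print(art)
```

A real script would first load the image, convert it to grayscale and shrink it, because characters are taller than they are wide.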

(1) (2) (3)

Convincing Universities to include Python

Krishnakant Mane

Talk/Paper Abstract

Python has been around long enough that it deserves serious attention from the educational institutes that teach computer science. Today Python is known for its simple syntax and powerful performance (even if not the fastest, which is not always needed anyway). From scientific computing to graphical user interfaces, and from system administration to web application development, it is used in many domains. However, due to industrial propaganda promoting other interpreted languages (free or proprietary), Python has not received the justice it deserves in the educational sector. This paper discusses methodologies that can be adopted to convince universities to include Python in their curriculum. The speaker will provide insight into his experience of getting Python included in some universities. The case of SNDT University will be discussed, where the curriculum designers have decided to have Python in their courses from next year. The speaker will share the ideas that led to this inclusion. These will include,

Python based Galaxy workflow integration on GARUDA Grid

Harikrishna

Talk/Paper Abstract

Bioinformatics applications, being complex problems involving multiple comparisons, alignment, mapping and analysis, can be managed better using workflow solutions. Galaxy is an open, web-based platform developed in Python for genomic research. Python is a lightweight dynamic language, which makes Galaxy modular and expandable. Bioinformatics applications, being compute- and data-intensive, scale well in grid computing environments. In this paper we describe bringing the Galaxy workflow to the GARUDA Grid computing infrastructure to enable bioinformatics applications. The GARUDA grid is an aggregation of heterogeneous resources and advanced capabilities for scientific applications. Here we present the integration of the Galaxy workflow tool with GARUDA grid middleware to enable computational biologists to solve complex problems on the grid environment through a web browser.