This system is similar to OHRAT, a system I designed and helped implement for Det Norske Veritas (DNV) in Scotland 25 years ago. OHRAT executed programs one after the other. This system executes programs in parallel, and is written for Windows rather than Unix. Designed and written by Bentley Beal.

Execute Linked SOA Objects In Parallel

Passing Information Between Objects

Utilizing A Data Flow Graph

San Diego, California

Email - VestpocketSoftware@gmail.com




Everyone knows that programs running in parallel run faster
than the same programs running one at a time

but not everyone has software that can run programs in parallel



Visual Programming Language Data Flow HPC Cloud Middleware
  • Easy-to-use software for running many ordinary programs in parallel on a network of computers. Until now, HPC (High Performance Computing) clusters have been 'geek friendly' and generally unavailable on Windows-based computers. Things have changed. Our software has a simple graphical user interface that lets you draw the solution to a complex problem and run the Windows programs you already have in parallel, either on computers you rent for a short period of time from a cloud computer provider or on computers you already own. You can have computing power that rivals the largest supercomputers, only when you need it, at a price any organization can afford.
  • In a business office a complex project may be broken down into tasks that are assigned to a number of office workers. This speeds up the completion of the project by distributing the work. This system is similar in the way that it operates, but uses computers instead of managers and office workers to perform tasks. Separate parts of a complex problem are solved in parallel on networked computers that interact with each other using a data flow paradigm. Service Oriented Architecture (SOA) objects are actually ordinary programs enclosed in a special 'wrapper' that allows them to interact with the system in a completely consistent way. The input and output for each SOA object are stored as objects in a database. Each SOA object performs a particular computational service. The SOA objects and the data flow between them can be organized by the user into any arrangement necessary to solve a problem.

    Because a SOA object's execution time is usually much greater than the overhead of preparing the object for execution and storing its result, the overhead of parallel execution is not an important factor, and experiments have confirmed this. The longer the execution time, that is, the more 'coarse grained' a problem is, the less significant the parallel overhead becomes. Large increases in the speed of solving problems can be achieved by running portions of a problem in parallel. In a test of a problem with 63 SOA objects, the total run time on 1 machine with a CPU speed of 2.8 GHz was 236 seconds. The same problem on 4 machines, each with a CPU speed of 1 GHz, had a total run time of 22 seconds.
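As a rough illustration of the two points above, the short Python sketch below computes the speedup implied by the quoted run times and shows how a fixed per-object overhead shrinks as a share of the total when execution time grows. The function names and the 0.5 second overhead are assumptions made for illustration only; they are not measurements from the system.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Ratio of the single-machine run time to the parallel run time."""
    return serial_seconds / parallel_seconds


def overhead_fraction(execution_seconds: float, overhead_seconds: float) -> float:
    """Share of an object's total time spent on preparation and result storage."""
    return overhead_seconds / (execution_seconds + overhead_seconds)


# Run times quoted in the text above: 236 s on 1 machine, 22 s on 4 machines.
print(round(speedup(236, 22), 1))

# Hypothetical figures (not measurements): a fixed 0.5 s overhead per object.
for execution_seconds in (1, 10, 100):
    print(execution_seconds, round(overhead_fraction(execution_seconds, 0.5), 3))
```

With these made-up overhead figures, the overhead share falls steadily as the objects become more coarse grained, which is the behaviour the paragraph describes.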

    You may configure the computers in your organization so that they can be used during the day for their normal work, and be used at night as part of your organization's own HPC cloud. No one will ever be aware they are being used this way at night because no trace is left on the machines the next morning. Configuring your computers this way will also not harm their performance in any way or use any disk space on the machines for the HPC cloud's data.

    A second option is to rent 'virtual computers' from a cloud computer service provider, run your projects at a remote location, and pay only for the time you use on the remote computers.
    In either case, your input data and output results are stored as objects in your local database and are not accessible to anyone outside your organization.
Parallel Data Flow Development Framework

Web Parallel Data Flow Development Framework



    This web site describes the Parallel Data Flow Development Framework and the Web Parallel Data Flow Development Framework, which we have developed for creating libraries of 'coarse grained' SOA objects, executing data flow directed graphs constructed from these SOA objects while passing data between them, and displaying the results of their calculations. The SOA objects in the libraries are somewhat like the objects in libraries used in object oriented programming languages. These objects can be combined in any way the user desires, as long as the result is an acyclic directed graph. The system for creating collections of SOA objects, referred to as projects, can be thought of as something like a graphical programming language. Ordinary programs, perhaps previously written in C, C++, VB, FORTRAN, etc., are turned into 'SOA objects' by 'wrapping' them in a predefined input/output structure.
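The acyclic requirement can be illustrated with a short Python sketch. It is not the framework's own code: the project is represented here as a plain dictionary, the object names are hypothetical, and the check relies on the standard library's graphlib module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(project: dict[str, set[str]]) -> list[str]:
    """Return an order in which every object runs after the objects it depends on.

    `project` maps each object name to the set of objects whose output it reads.
    Raises ValueError if the data flow graph contains a cycle.
    """
    try:
        return list(TopologicalSorter(project).static_order())
    except CycleError as error:
        # The second element of args holds the cycle that was detected.
        raise ValueError(f"project is not acyclic: {error.args[1]}") from error


# Hypothetical project: two independent readers feed a merge step, then a report.
project = {
    "read_a": set(),
    "read_b": set(),
    "merge": {"read_a", "read_b"},
    "report": {"merge"},
}
print(execution_order(project))
```

A project in which two objects read each other's output would be rejected with a ValueError, since no valid execution order exists for it.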

    The development framework that is used to design projects is completely separate from the server programs that run the projects, and the two are written in different computer languages. They share only a common database of projects and data, and they communicate with each other only through messages sent over TCP/IP sockets. The entire system that runs projects could be completely replaced without altering the part that is used to design the projects.
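The text does not describe the message format, so the Python sketch below is only one plausible way two separately written programs could exchange messages over a socket. The length-prefixed JSON framing, the function names, and the 'run_project' command are all assumptions, and a local socket pair stands in for a real TCP/IP connection.

```python
import json
import socket
import struct


def send_message(sock: socket.socket, message: dict) -> None:
    """Send one JSON message prefixed with its length as a 4-byte big-endian integer."""
    payload = json.dumps(message).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def _receive_exactly(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes, since recv() may return fewer than requested."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("socket closed before the message was complete")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def receive_message(sock: socket.socket) -> dict:
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", _receive_exactly(sock, 4))
    return json.loads(_receive_exactly(sock, length).decode("utf-8"))


# A connected socket pair stands in for the designer and a server.
designer, server = socket.socketpair()
send_message(designer, {"command": "run_project", "project": "postal_merge"})
print(receive_message(server))
designer.close()
server.close()
```

Because both sides agree only on the framing and the message contents, either side could be rewritten in another language without the other noticing, which is the property the paragraph above relies on.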

    A SOA Object Designer tool is included to help the user define an object's input/output structure, called a wrapper, and to generate a 'viewable' input and output form for each SOA object. The individual input and output objects in a wrapper can also be designed and edited separately using the Input/Output Object Editor tool. Data flow between SOA objects is accomplished using pipes, and the SOA Object Designer provides a code generator that produces templates, eliminating the need to write the message handling, parsing, and piping code for a SOA object. The user is only responsible for writing the code that creates the interface between the program and its 'wrapper'. Writing this interface is usually quite simple and involves three steps. The first step is to write code that moves the values from the arrays and values defined by the wrapper into the corresponding internal arrays and values of the program, and then executes the computational part of the program. The second step is to write code that moves the internal results of the calculation, after the program's completion, into the corresponding wrapper output arrays and values, and calls the wrapper's normal execution exit. The third step is to eliminate all of the code that deals with the display of error messages and premature termination, and replace it with code that stores error messages in the wrapper's final message area and calls the wrapper's premature termination exit. The 'main' of a typical program usually calls three methods: one to copy the inputs in, one to do the calculations, and one to copy the outputs and terminate the program. A fourth method may be called to store an error message and abnormally terminate the program. If the user wants to access the inputs and outputs directly, they may do so and thus eliminate the need for copying. This is often done when a new program is written and there is no need to 'wrap' an old program.
'Wrapping' an existing program is tedious but not particularly hard to do. A simple, well-structured program with no GUI may take a few minutes to 'wrap'. A complex, somewhat poorly structured program with a few possible errors and a properly partitioned GUI may take a day or two. A very complex, very poorly structured program with many possible errors and a complex, poorly partitioned GUI may take many weeks to 'wrap'.
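The shape of a 'wrapped' program can be sketched in a few lines of Python. This is not the code the SOA Object Designer generates: the Wrapper class, its method names, and the small MeanProgram are hypothetical stand-ins, used only to show a 'main' that copies inputs in, calculates, copies outputs out, and routes errors to the wrapper instead of displaying them.

```python
from dataclasses import dataclass, field


@dataclass
class Wrapper:
    """Hypothetical stand-in for a generated wrapper: inputs, outputs, final message."""
    inputs: dict = field(default_factory=dict)
    outputs: dict = field(default_factory=dict)
    final_message: str = ""

    def normal_exit(self) -> int:
        return 0

    def premature_exit(self, message: str) -> int:
        # Errors are stored in the wrapper instead of being displayed.
        self.final_message = message
        return 1


class MeanProgram:
    """The 'old' program: it only knows its own internal arrays and values."""

    def __init__(self) -> None:
        self.values: list[float] = []
        self.mean = 0.0

    def copy_inputs(self, wrapper: Wrapper) -> None:
        # Step 1: move wrapper inputs into the program's internal values.
        self.values = list(wrapper.inputs["values"])

    def calculate(self) -> None:
        if not self.values:
            raise ValueError("no input values")
        self.mean = sum(self.values) / len(self.values)

    def copy_outputs(self, wrapper: Wrapper) -> None:
        # Step 2: move internal results into the wrapper's outputs.
        wrapper.outputs["mean"] = self.mean


def main(wrapper: Wrapper) -> int:
    program = MeanProgram()
    try:
        program.copy_inputs(wrapper)
        program.calculate()
        program.copy_outputs(wrapper)
        return wrapper.normal_exit()
    except (KeyError, ValueError) as error:
        # Step 3: store the error message and take the premature termination exit.
        return wrapper.premature_exit(str(error))


wrapper = Wrapper(inputs={"values": [2.0, 4.0, 9.0]})
print(main(wrapper), wrapper.outputs)
```

A program written from scratch could read wrapper.inputs and write wrapper.outputs directly, which removes the two copy steps.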

    General purpose objects such as IF objects, CASE objects, LOOP objects, etc. are also provided. Loops are ordinarily not possible in an acyclic directed graph. A LOOP object generates a stream of objects, and a stream of objects can cause something resembling a loop to occur in an acyclic graph. In fact, a 'loop' can be executed in parallel by segmenting the stream of objects and sending the segments to separate workers. See the Postal Merge example for an explanation of parallel segmentation.
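Parallel segmentation of a stream can be sketched as follows. This is an illustration under stated assumptions rather than the system's implementation: a Python generator stands in for the LOOP object, threads stand in for worker computers, and the function names and the squaring 'loop body' are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator


def loop_object(count: int) -> Iterator[int]:
    """Stand-in for a LOOP object: emits a stream of objects (here, integers)."""
    yield from range(count)


def segments(stream: Iterable[int], size: int) -> Iterator[list[int]]:
    """Cut a stream into consecutive segments of at most `size` objects."""
    iterator = iter(stream)
    while segment := list(islice(iterator, size)):
        yield segment


def process_segment(segment: list[int]) -> list[int]:
    """Stand-in for the loop body, applied by one worker to one segment."""
    return [value * value for value in segment]


def run_loop_in_parallel(count: int, segment_size: int, workers: int) -> list[int]:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in segment order, so the output order is preserved.
        results = pool.map(process_segment, segments(loop_object(count), segment_size))
        return [value for segment in results for value in segment]


print(run_loop_in_parallel(count=10, segment_size=3, workers=4))
```

The graph itself stays acyclic: the 'loop' exists only as a stream of objects that is cut into pieces and fanned out to workers.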

    In addition, a Network Installer tool is included to help the user install the system on their network worker computers. This tool sets up the environment variables on computers that are intended to be part of the parallel computer network. After the 'test connection' option has been used to test the connection to the master server, the 'upload' option can be used to upload the current version of the software from the master server. After a worker computer is set up, it can be started and it will automatically connect to the master server. Computers that are normally used for work during the day may be used to form an HPC cloud at night, when they are not being used for other purposes. In that case the computers are left running when the employees go home. The master server can then send the worker computers a message to 'wake up and come to work'. The worker computers will start their worker servers and 'report for duty'. All of the computers that 'report in' can then be used to solve large problems in parallel. When the employees come back to work, the master server can send a 'go home' message to the worker computers. They will shut down their worker servers, and the employees will never be aware that their computers were being used as part of an HPC cloud while they were away. No data is stored on the worker computers, so the employees will not be able to access any computational results and no disk space will be consumed on their machines. Having the parallel server software installed on a computer does not impact its performance.

    We have been thinking of adapting the framework we have developed to generate interfaces and run projects using existing supercomputer software. Hadoop, Amazon Elastic Compute Cloud, Google App Engine, and RightScale are being considered as possible 'back ends'. This approach would eliminate the need for using our master servers and worker servers.
  • Software for parallel computers to quickly answer seemingly impossibly difficult queries. This approach is not based on the 'sifting approach' of data mining, but on a different idea that has its roots in set theory, combined with some of the indexing techniques used in internet search engines. This interest has evolved from a theoretical state to a practical solution, and we use a parallel computer network to implement the concept.
Services
  • Design and Implementation of Cloud Computer Applications. We work with customers to determine the suitability of running their applications in a cloud computer environment. If they are a good fit, we help them to convert their applications to run as cloud computer applications. In the event they want to develop new cloud computer applications, we assist them through all phases of the development cycle to produce fast and reliable cloud computer applications.
  • Design of complex Object Oriented Databases. We are experts in object oriented databases, and have designed the databases for a number of commercial products that are currently being used by major corporations. We specialize in designing database systems that have unusual and difficult requirements, such as the ability to easily adapt to rapid change without requiring changes in the structure of the database or reprogramming, databases that have notes and change history logging on individual objects, databases that have versions of objects and the ability to see and work with different versions of objects at the same time, databases that have cut and paste facilities, etc. As a side effect, the software that we design and implement is typically simpler, smaller, and easier to maintain compared to conventional designs. Some implementations have only 10% of the lines of code required by comparable 'brute force' designs. We are especially experienced in database designs that are complex networks of objects. We are not limited by conventional 'row and column table oriented relational database thinking', and can therefore solve database problems that others would find difficult or impossible to solve.
Non technical description of the Parallel Data Flow HPC Cloud Software

The system resembles data flow in a business office. The master server resembles an office manager in this metaphor, and the worker computers resemble clerical office workers. The console resembles a senior manager who designs projects and delivers them to the office manager to be completed. The office manager breaks down each project into groups of jobs called sub projects. The sub projects must be completed one at a time, because each sub project depends upon the successful completion of the previous one. The office manager distributes the jobs in a sub project into the 'in baskets' of the workers that are under utilized, also taking into account the speed and skills of the workers during the distribution process. The workers work on the jobs in their 'in baskets' in parallel, not paying any attention at all to what the other workers are doing. The office manager picks up completed jobs from the workers' 'out baskets' and checks off each job in a sub project as either completed or not completed because of a serious error. When all the jobs in a sub project are complete, the next sub project is ready to be worked on and the jobs in that sub project are distributed to the workers. When all the sub projects in a project are complete, the project is finished. When a project is finished, the office manager informs the senior manager, and the senior manager can inspect the results. If a serious error in a project is encountered, the project is stopped and the senior manager is notified. After the senior manager corrects the error, the project can be restarted at the point where the error occurred.

In a business office system the objective of the office manager is to efficiently break down and distribute jobs, and to make sure there are no empty 'in baskets'. If all the 'in baskets' have something in them at all times the system is operating at maximum efficiency and the workers are 100% busy. The objectives are the same for the parallel network system. The system is working efficiently if all of the worker computers are busy 100% of the time and no worker computer is either under utilized or over utilized.
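The office manager's routine described above can be sketched in Python. This is a simplified illustration, not the master server's algorithm: threads stand in for worker computers, the job names are hypothetical, every job is assumed to appear as a key in the dependency table, and the sketch neither weighs worker speed or skills nor supports restarting after an error.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def sub_projects(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group jobs into sub projects; each group depends only on earlier groups."""
    remaining = {job: set(needs) for job, needs in dependencies.items()}
    done: set[str] = set()
    groups = []
    while remaining:
        ready = sorted(job for job, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("the project graph contains a cycle")
        groups.append(ready)
        done.update(ready)
        for job in ready:
            del remaining[job]
    return groups


def run_project(
    dependencies: dict[str, set[str]],
    jobs: dict[str, Callable[[], object]],
    workers: int = 4,
) -> dict[str, object]:
    """Run sub projects one at a time; run the jobs inside each one in parallel."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for group in sub_projects(dependencies):
            futures = {job: pool.submit(jobs[job]) for job in group}
            for job, future in futures.items():
                # result() re-raises a job's exception, which stops the project here.
                results[job] = future.result()
    return results


# Hypothetical project: two readers, a merge that needs both, then a report.
dependencies = {
    "read_a": set(),
    "read_b": set(),
    "merge": {"read_a", "read_b"},
    "report": {"merge"},
}
jobs = {name: (lambda name=name: f"{name} done") for name in dependencies}
print(sub_projects(dependencies))
print(run_project(dependencies, jobs))
```

Keeping every 'in basket' full corresponds here to each sub project containing at least as many ready jobs as there are workers; a sub project with a single job leaves the other workers idle.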

You simply apply by entering your former (old) and your new email address. If you wish, you can turn on a security option when you apply that will prevent unauthorized access to your new email address. I only recommend this for the few who really don't want just anyone to know their new email address. There is a security process you have to go through to register, so that unauthorized individuals cannot get control of your new email address. After you are registered, people type in your old address, press the 'Go' button, and your new email address appears. If you sign up with the security option on, you can turn it off and on, you can enter the addresses of people who you want to know your new email address, you can screen people who want to know your new email address, etc.