Features from the xrg dungeons

This series of articles aims to guide you through a set of pending features for OpenERP. Some of them are experimental, some more mature, some need to continue their maturing process...

Wednesday, October 17, 2012

ERP for an enterprize

In response to a scalability question, this is the moderately long answer:

Introduction

First, let me state that I don't believe in the word "enterprize", because it has traditionally been abused in the IT world to mean all sorts of mostly negative things, like "enterprize" as in "cost you a fortune", "enterprize" as in "absurd and complicated" etc.

Apart from that, we can stick to the literal meaning of an "enterprize" which may have 5 aspects:
  1. a large number of users
  2. a large number of requests and/or data traffic
  3. a large database, in millions of records
  4. an extensive set of operations, procedures, workflows
  5. a strict set of protocols (of deployment, security, manageability) and SOPs
 

1. Number of users

Here, no truly open source software really suffers, because there is no licensing on the number of users (beware of offers that limit you on that!). We just put a 32-bit number on the UID and let you have as many as you like.

This will hold until the point where too many users request something from your server at once, which takes us to point 2.

2. Number of requests/resources

We have all sorts of physical or soft limitations to the number of requests (and their speed) you can serve at any time.
So, you may have 10k users idling around with just a login session, but the actual trouble begins when 1000 of them decide that they want page X, now. Requests vary in nature, in the time they require your CPU and I/O to be processed.
Do remember that, by default, Postgres will only serve about a hundred connections. You might want to increase this limit, at the cost of RAM which will be reserved for the db. Other system limits to look out for are the number of open files, the number of processes and, of course, always the available RAM.
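
To see where a given box stands against these limits, you can read them from the OS and compare them with the load you plan for. The following is a minimal sketch for a Unix-like host, using Python's standard `resource` module. The helper names (`check_capacity`, `current_nofile_limit`) are made up for illustration, and the Postgres `max_connections` value is assumed to be passed in by hand (nothing here connects to the database).

```python
import resource


def check_capacity(expected_connections, pg_max_connections, nofile_limit):
    """Return a list of human-readable warnings (empty if all looks fine).

    All arguments are plain numbers; this is a sanity check, not a benchmark.
    """
    warnings = []
    if expected_connections > pg_max_connections:
        warnings.append(
            "Postgres max_connections (%d) is below the expected %d connections"
            % (pg_max_connections, expected_connections))
    # Each connection needs at least one socket, i.e. one file descriptor.
    if expected_connections >= nofile_limit:
        warnings.append(
            "open-files limit (%d) is too low for %d connections"
            % (nofile_limit, expected_connections))
    return warnings


def current_nofile_limit():
    """Soft limit on open file descriptors for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return float("inf")
    return soft
```

For instance, `check_capacity(1000, 100, current_nofile_limit())` would at least complain about the connection limit, which is the "1000 of them want page X, now" case above.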

There are two sides to scaling this thing up: you can add more hardware (on a single or distributed - load balanced - system) or you can resolve the performance culprits to let the application run lighter.

3. Large database

Our database, Postgres, doesn't practically have any hard limits on the number of records or so. [ well, it does, but they are sky-high! ]

However, a large db will bring up all sorts of performance problems, when sub-optimal queries are used. For example, computing a set of 500 accounts with 5-10M of accounting entries would cause unacceptable delays in the application's responsiveness.
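
A generic illustration of what "sub-optimal" can mean here: asking the database once per account, instead of letting it aggregate everything in a single pass. The sketch below uses an in-memory SQLite database only as a stand-in for Postgres, and the `entry` table with its columns is hypothetical (it is not the OpenERP accounting schema, nor the actual optimization applied in the ERP).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (account_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO entry (account_id, amount) VALUES (?, ?)",
    [(i % 500, i) for i in range(50000)])


def balances_slow(conn, account_ids):
    """One query per account: 500 round-trips for 500 accounts."""
    result = {}
    for acc in account_ids:
        row = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM entry WHERE account_id = ?",
            (acc,)).fetchone()
        result[acc] = row[0]
    return result


def balances_fast(conn):
    """A single aggregate query; the database does the grouping."""
    rows = conn.execute(
        "SELECT account_id, SUM(amount) FROM entry GROUP BY account_id")
    return dict(rows.fetchall())
```

Both functions return the same balances; the difference is in how many queries (and how many scans of the entries table, if it is not indexed) are needed to get there.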

4. Operations, Workflows

There, we talk about the complexity of the ERP schema, and the implications of using it in an extended deployment.
A good ERP will be flexible, and scalable, meaning that it will allow your IT team (you have one, don't you?) to configure it and adapt it to your company's complex needs.
How much would it cost to add a form? Or re-route a workflow? Or implement some custom data connector to your legacy systems?

5. Protocols, deployment

A little different from point 4., the deployment has to do with IT rules that you have chosen to follow for all your enterprize software.
Would you blindly download something from the cloudy Internetz and use it in your production servers? Would you just "place the files there" and hope it runs? Could you survive a vague release schedule, and/or distribution methods?

Conclusion

Over the years, I've tried to address as many of the above points as possible, in several complementing ways:
  • extensive optimizations and profiling of the ERP
  • debugging, more debugging and counter-measures against bugs
  • making the framework more developer-friendly, easier to hack and adapt to any different needs
  • keeping *always* some strict design principles, to ensure that the final product will be deployable in enterprize environments, will have proper release schedules and smooth migrations.
  • adding hooks for extension, so that a production server can be amended even in a "hot" production-critical setup.
That's what powers F3, as a matter of fact.

Monday, February 27, 2012

F3, a little closer

Conceived: 2009
Implemented: 2009-2011

An open-source project can neither live nor thrive under the stubborn governance of a single vendor. No matter how nice the sales figures may be, the diminishing community participation in OpenERP has always been a concern.

Plan A has been to convince the heads of OpenERP SA that open-source is their greatest asset, and would only remain so if they learnt to respect the open-source principles.

Now, we are at plan B, which is to evolve this project into a community-led governance model. So far (because I'm not a marketing guy), the name is "F3" as in "3rd feature branch".
Apart from a wide set of improvement patches that have been developed against 6.0 in the "pg84" and "pg84-next" branches (the 2nd feature branches), the goals of "F3" are, in short:
  • to have an open project, where all contributions will be treated equally (technically bad ones will still remain out!) and all contributors will be awarded for their effort
    Note: this means we could have more than one branch, different "series" of the project depending on debated decisions. It is nice to have plurality, rather than a SVN-style "trunk" and stick with bad decisions.
  • to remain compatible with the previous codebase, respect the work all of us have done against previous versions. Also, respect our customers who have had wide installations, and the integrators who invested in them.
  • to have scheduled, predictable releases. A sane release cycle with feature windows, testing periods, release + integration hold-off time. Features shall be predefined, too.
  • to be stable. To be business-minded (ERP users expect a rock-solid system, not marketing fireworks).
  • to be frank. Open-source is supposed to be free of marketing b**hit. No demo-quality features, only ones that can really stand in production. If we have a bug, we admit it and fix it.
  • to be efficient. Both at the final product (pg84-next is already >40% faster than 6.0/6.1) and the way we develop it (right tools, debugging, tests, communication)
So far, some of you may have discovered the branches of "F3" series that are already out. That code is just the beginning, and a proof that this project is alive. More important are all these flags in my mailbox, "lost" commits that all of you had done against 6.0 and got ignored in the bzr series.

A set of design innovations and technical features of F3 is yet to be written (you know, documentation is that last task we tend to avoid :S ).

Why not Tryton? I get this question often. It has to do with the codebase, which departed from the Tiny API, back then, and would require major porting (and lack of features) from any current TinyERP/OpenERP installation. Still, their work and community model is highly appreciated.

In June 2010, in G-R, I said I would make OpenERP (6.0, then) a better product. My plain word is stronger than some who don't honor their promises (see: release date, features+support, payments).

Let the code speak for itself.

Tuesday, October 18, 2011

open at f3

Conceived: Late 2009
Implemented: 2009-2011
Commit: ff4fc336fd638b6b16a5445d252770efc62cb84f (current head, more to come)

Sunday, September 18, 2011

Status of Buildbot@pefnos

Conceived: June 2010
Implemented: July 2011

As some of you may know, a new, shiny buildbot is running now at my home premises (so far). During the last 2-3 weeks, it has been in "production", yet improving every day, with bugfixes for every new dark corner discovered.

Short features of the new buildbot (version 3):
  • All configuration is now inside the OpenERP db. Only connection credentials at master/slave.
  • Fully factored, also meaning:
  • Multi-VCS: already supports Git, Bzr, pending implementation of SVN, too
  • Multi-repository: supports range of repositories, easily configurable. Takes care of local proxying, so that bandwidth is saved.
  • Forked repos support: thanks to a usage case by Sharoon Thomas, "forked" repositories are supported, where they may share commits with their parent one (and siblings).
  • Easy branch configuration
  • Mirroring engine: with some easy steps, set up and maintain Git<->bzr mirroring (SVN coming, too). Also, mirroring data stays inside the OpenERP db, ensuring that we can preserve them reliably.
  • Multi-project: no longer limited to OpenERP test builds. It can now do any kind of builds/tests
  • Async. commands: with a click on the OpenERP GUI, actions are triggered in the buildbot, in real time.
  • Scalable, fast: ported to exploit the pg84-next features, it is much faster now, designed to pull heavier loads of branches on a humble (Atom(tm) currently) machine.
  • Multi-builders/slaves: multiple slaves support, for trusted/untrusted builds, Mageia, Fedora, Debian distro hosts, remote buildslaves etc.

Status update

Some may have noticed that from 13 Sept. to 17 Sept., no "addons" mirror or build processes had run. This was intentional. These builders had been put offline (not polling) because a new algorithm of verifying the bzr->git process was being tested. It is highly important that the mirroring process is double-checked, so that we can trust the git copy to be identical to the Bzr source. Also, it just happened that the "trunk 6.1 addons" received a merge sprint at the same time, which stressed the mirroring (at some point, the intermediate fast-export file was 2.5GB, eeek!)

Now, the process looks good and verification (so far) doesn't indicate inconsistencies.
New branches (from community) have been added, and may soon appear at the "extra-addons" tests.


Note: I will not publish the address of the server here. My upstream bandwidth is limited and I do not wish to use all of it for page views. Hopefully, this infrastructure will soon move to some better site with enough bw for all of you to enjoy.

Tuesday, August 9, 2011

Reverse-RPC

Status: Testing
Conceived: July 2011, based on a2billing 2007 technology
Implemented: July-August 2011
Commit-id: 7ae8b4f81e4aacad at "addons"

The RPC enhancements saga continues...

... now, in the reverse direction: pass commands from the server to the client. Say, why would we ever need to do that?

Take, for example, the openerp-buildbot. It is an openerp client connected to a command database. But it is a smart, autonomous bot that processes builds, tasks. Sometimes, the OpenERP database needs to send a request to the buildbot, like "please, start the pending builds" or even "reconfigure yourself, my database-stored layout has changed".

We want a command to be sent from the OpenERP server to a connected client, containing a payload of data (command arguments), and hopefully executed in real time.

Note: this protocol is not really guaranteed to deliver real-time results. It only promises best-effort flow.
But, still, we don't want to define a different RPC protocol, with a socket opened from the server to the client. We stick with the existing Net-RPC or HTTP implementations, from client to server.

The idea is trivial: we push all requests into a table (ORM model), and let clients pop them, execute them and send the result back. This way, we can keep a closely monitored track of execution flow, and keep an asynchronous design. It's the same technology I'd used in the A2Billing v2 notification (aka. alarms) feature.
By using Koo's "subscription" protocol, the clients to that "commands" table can wake in real time and process their pending tasks. Python's meta-programming capabilities also help expose a virtual RPC-like API to both the server (which issues commands) and the client.
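
The push/pop/report cycle described above can be sketched in a few lines. This is only an illustration of the idea, not the code of the actual addon: the table is a plain in-memory dict instead of an ORM model, there is no "subscription" wake-up, and all names (`CommandTable`, `CommandProxy`, `client_poll`) are hypothetical.

```python
import itertools


class CommandTable:
    """In-memory stand-in for the ORM model that stores pending commands."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.rows = {}

    def push(self, address, method, args):
        cmd_id = next(self._ids)
        self.rows[cmd_id] = {"address": address, "method": method,
                             "args": args, "state": "pending", "result": None}
        return cmd_id

    def pop_pending(self, address):
        """Hand out pending commands for one client, marking them as taken."""
        taken = []
        for cmd_id, row in sorted(self.rows.items()):
            if row["address"] == address and row["state"] == "pending":
                row["state"] = "running"
                taken.append((cmd_id, row["method"], row["args"]))
        return taken

    def set_result(self, cmd_id, result):
        self.rows[cmd_id].update(state="done", result=result)


class CommandProxy:
    """Server side: any attribute access becomes a queued command."""

    def __init__(self, table, address):
        self._table = table
        self._address = address

    def __getattr__(self, method):
        # Only reached for names that are not real attributes of the proxy.
        def enqueue(*args):
            return self._table.push(self._address, method, list(args))
        return enqueue


def client_poll(table, address, handler):
    """Client side: pop commands, run them on `handler`, report results."""
    for cmd_id, method, args in table.pop_pending(address):
        table.set_result(cmd_id, getattr(handler, method)(*args))
```

The `__getattr__` trick is the meta-programming part: the server code can call `proxy.anyMethod(...)` without the proxy knowing the client's API in advance, and every command leaves a row whose state can be monitored.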

Example (at the server, taken from buildbot):

bc_obj = self.pool.get('base.command.address')
proxy = bc_obj.get_proxy(cr, uid, 'software_dev.buildbot:%d' % (bbid))
proxy.triggerMasterRequests()

Client code:

class MasterPoller:

    @call_with_master
    def triggerMasterRequests(self, master):
        d = master.pollDatabaseBuildRequests()
        return d

Thursday, June 16, 2011

RPC-JSON, FTW!

Status: RFC, beta
Gitweb: http://git.hellug.gr/?p=xrg/openerp-sandbox
Conceived: winter 2010
Implemented: June 2011



It's here, I said I would make it. :)


And it's not even part of the server. Just an addon that extends the supported HTTP protocols. At the same ports (namely 8069 and 8071). The client (library) and the server will now transparently negotiate JSON instead of XML-RPC. Using the same stack, same authentication classes (modular) as XML-RPCv2, same dispatchers, we get rid of the slow XML in favor of JSON marshalling of our RPC payloads.


In an attempt to prepare for the future, a RESTful approach has been chosen for the implementation of the RPC-JSON protocol. So, a 3rd-party client could enjoy URLs that are mapped to the ORM objects or the server's export services.
Note that the name is RPC-JSON rather than JSON-RPC. It has to be a different name, since this implementation is not strictly a vanilla JSON-RPC one. We are compatible with, but not limited to, a strict JSON-RPC specification.
For example, even with a plain browser, you can issue:
GET /json/orm/db-name/res.partner/read?0=123
and fetch the data of res.partner[123] in a JSON packet. (you will be asked for http authentication, of course)
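
For a 3rd-party client, composing such URLs is straightforward. Below is a small sketch (Python 3 standard library) that reproduces the URL from the example above. The function name `build_orm_url` is made up, and the assumption that further positional arguments would be numbered `1=`, `2=` and so on is mine; only the `0=123` form is shown in this article.

```python
from urllib.parse import quote, urlencode


def build_orm_url(base, dbname, model, method, *args):
    """Build an RPC-JSON style URL; positional args become 0=..., 1=..."""
    path = "/json/orm/%s/%s/%s" % (
        quote(dbname, safe=""), quote(model, safe=""), quote(method, safe=""))
    query = urlencode([(str(i), a) for i, a in enumerate(args)])
    return base.rstrip("/") + path + ("?" + query if query else "")
```

Calling `build_orm_url("http://localhost:8069", "db-name", "res.partner", "read", 123)` gives the same path and query as the GET line above. The HTTP authentication step is left out of this sketch.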

In some tests carried out, RPC-JSON seems to significantly reduce the CPU time needed at both the server and clients, regarding the RPC communication. However, the size of data has only fallen from 2.05MB to 1.61MB, because gzip encoding optimizes both cases to a matching level (note that the old XML-RPCv1 protocol, uncompressed, is 4.68MB).
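
If you want to get a feeling for this effect yourself, the comparison is easy to repeat with the Python 3 standard library. The sketch below marshals the same made-up list of records as XML-RPC and as JSON, then gzips both. The sample records are invented and it does not reproduce the measurements quoted above; it only shows the method.

```python
import gzip
import json
import xmlrpc.client

# Invented sample payload, loosely shaped like a read() result.
records = [{"id": i, "name": "Partner %d" % i, "active": True,
            "customer": bool(i % 2), "city": "Gerompont", "ref": False}
           for i in range(200)]


def payload_sizes(data):
    """Return {format: (raw_bytes, gzipped_bytes)} for the same payload."""
    encoded = {
        "xml-rpc": xmlrpc.client.dumps(
            (data,), methodresponse=True).encode("utf-8"),
        "json": json.dumps(data).encode("utf-8"),
    }
    return {name: (len(raw), len(gzip.compress(raw)))
            for name, raw in encoded.items()}


sizes = payload_sizes(records)
for name, (raw, packed) in sorted(sizes.items()):
    print("%-8s raw=%d gzip=%d" % (name, raw, packed))
```

The raw XML is much more verbose than the raw JSON, but since most of that extra verbosity is repeated tags, gzip removes a large part of it. That is why the gain of RPC-JSON shows more in CPU time than in bytes on the wire.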

Monday, April 18, 2011

OpenERP client Library

Status: beta
Gitweb: http://git.hellug.gr/?p=xrg/openerp-libcli
Clone URL: http://members.hellug.gr/xrg/repos/openerp-libcli
Conceived: summer 2010
Implemented: March 2011

The title should be enough.
One of the goals is to include all the protocol magic in one library, so that client implementations no longer need to worry about that. Also, non-Python client libraries could use this one as a reference.
So far, Net-RPC, XML-RPCv1, XML-RPCv2 are supported, with Pyro code also pasted in (but not tested, expect it to burst into flames).


Python 2.7.1
>>> from openerp_libclient import rpc
>>> rpc.openSession(proto="http", host='localhost', port='8169', user="admin", passwd="admin", superpass="admin", dbname="test_bqi")
>>> rpc.login()
1
>>> proxy = rpc.RpcProxy('res.partner')
>>> print proxy.read([1])
[{'comment': False, 'ean13': False, 'date': False, 'id': 1, 'city': 'Gerompont', 'user_id': False, 'title': False, 'company_id': [1, 'OpenERP S.A.'], 'parent_id': False, 'employee': False, 'ref': False, 'email': False, 'vat': False, 'website': False, 'customer': True, 'bank_ids': [], 'child_ids': [], 'supplier': False, 'address': [1], 'active': True, 'lang': 'en_US', 'credit_limit': False, 'name': 'OpenERP S.A.', 'phone': '(+32).81.81.37.00', 'mobile': False, 'country': [20, 'Belgium'], 'events': [], 'category_id': []}]
>>> print proxy.read([1], fields=['name', 'date'])
[{'date': False, 'id': 1, 'name': 'OpenERP S.A.'}]
>>>