
Open-source projects update

I spent the last two days going over most of our open-source projects, Symfony2 bundles and other libraries to fix issues, merge pull requests and tag releases. Here is an update on all the changes:

Alice v1.5.0 – Expressive fixtures generator

  • Added extensibility features to allow the creation of a Symfony2 AliceBundle (hautelook/alice-bundle)
  • Added possibility to fetch objects by id with non-numeric ids
  • Added (local) flag for classes and objects to create value objects that should not be persisted
  • Added enums to create multiple objects (like fixture ranges but with names)
  • Added ProcessorInterface to be able to modify objects before they get persisted
  • Fixed cross-file references, everything is now persisted at once
  • Fixed self-referencing of objects

Also note that Baldur Rensch recently started working on a bundle to integrate Alice in Symfony2; you might want to check that out as well.
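
If you have never used Alice before, loading a fixture file boils down to a few lines. A minimal sketch, assuming a Doctrine-style object manager in $om and a fixtures.yml file next to the script:

<?php

require 'vendor/autoload.php';

// load the yaml file and get back all of the created objects
$loader = new \Nelmio\Alice\Loader\Yaml();
$objects = $loader->load(__DIR__.'/fixtures.yml');

// persist them with the ORM of your choice
// ($om being your Doctrine EntityManager or any other object manager)
foreach ($objects as $object) {
    $om->persist($object);
}
$om->flush();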

Monolog v1.6.0 – Logging for PHP

  • Added HipChatHandler to send logs to a HipChat chat room
  • Added ErrorLogHandler to send logs to PHP’s error_log function
  • Added NewRelicHandler to send logs to NewRelic’s service
  • Added Monolog\ErrorHandler helper class to register a Logger as exception/error/fatal handler
  • Added ChannelLevelActivationStrategy for the FingersCrossedHandler to customize levels by channel
  • Added stack traces output when normalizing exceptions (json output & co)
  • Added Monolog\Logger::API constant (currently 1)
  • Added support for ChromePHP’s v4.0 extension
  • Added support for message priorities in PushoverHandler, see $highPriorityLevel and $emergencyLevel
  • Added support for sending messages to multiple users at once with the PushoverHandler
  • Fixed RavenHandler’s support for batch sending of messages (when behind a Buffer or FingersCrossedHandler)
  • Fixed normalization of Traversables with very large data sets, only the first 1000 items are shown now
  • Fixed issue in RotatingFileHandler when an open_basedir restriction is active
  • Fixed minor issues in RavenHandler and bumped the API to Raven 0.5.0
  • Fixed SyslogHandler issue when many were used concurrently with different facilities
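
Two of the additions above combine nicely. A minimal sketch (not from the announcement itself) that wires a logger to PHP’s error_log and registers it as the global exception/error/fatal handler:

<?php

require 'vendor/autoload.php';

use Monolog\ErrorHandler;
use Monolog\Handler\ErrorLogHandler;
use Monolog\Logger;

// send all records to PHP's error_log function
$logger = new Logger('app');
$logger->pushHandler(new ErrorLogHandler());

// register the logger as PHP exception/error/fatal error handler
ErrorHandler::register($logger);

$logger->addWarning('Something looks odd');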

MonologBundle v2.4.0 – Monolog integration in Sf2

  • Added support for the console, newrelic, hipchat, cube, amqp and error_log handlers
  • Added monolog.channels config option to define additional channels
  • Added excluded_404s property to the fingers_crossed handler to avoid logging 404s matching those regex patterns
  • Added ability to set multiple user ids in the pushover handler
  • Added support for an empty dsn in raven handler

Note that as of this version, the bundle’s release cycle is de-synchronized from the framework’s. This means you can simply require "symfony/monolog-bundle": "~2.4" in your composer.json, and Composer will automatically pick the latest version of the bundle that works with your current version of Symfony. The minimum Symfony2 version for this workflow is 2.3.0.
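
In composer.json that looks like this:

{
    "require": {
        "symfony/monolog-bundle": "~2.4"
    }
}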

NelmioSecurityBundle v1.2.0 – Additional security features for Sf2

  • Added Content-Security-Policy (CSP) 1.0 support
  • Added forced_ssl.whitelist property to define URLs that do not need to be force-redirected
  • Fixed session loss bug on 404 URLs in the CookieSessionHandler

NelmioJsLoggerBundle v1.2.0 – JS error logging in your Sf2 monolog logs

  • Added ability to give more context information by setting window.nelmio_js_logger_custom_context

NelmioCorsBundle v1.1.0 – Cross-Origin Request Headers support for Sf2

  • Added ability to set a wildcard on accept_headers

That’s it for today, but I would like to thank everyone who was involved in sending pull requests or reporting bugs and feature requests to make all this happen!

July 30, 2013 by Jordi Boggiano in Development, News

Firefox OS App Day

Last weekend I attended the first Firefox OS App Day held in Switzerland. Mozilla held the event to promote their new mobile operating system to developers and to have people try building apps.

Firefox OS is a new mobile OS by Mozilla that is entirely web-based and therefore quite interesting for us web developers. They are developing many new JavaScript APIs to give developers access to all the functionality native apps typically get on other platforms. And by submitting these APIs for standardization to the W3C, there is hope that one day they will be available on all platforms, making mobile web apps an even more attractive option.

In the morning a few talks were held to introduce the ecosystem and APIs. Then after a lunch break we had a few hours to hack an application together. My idea was to take a picture with the front-facing camera whenever you get a call. That picture would then be stored as the contact’s picture, so that every time they call, you see the face you made the last time. It is a bit strange, but it sounded like a fun experiment.

Unfortunately the APIs to access the camera and handle incoming phone calls are for now restricted to pre-installed applications (so-called Carrier apps). This means you can develop such an app and test it on your device if it is installed via the USB cable, but you cannot deploy it on the Firefox Marketplace. It also means that right now those APIs are even less documented than the rest.

I did not find any docs for the camera API at all and had to dig into the camera app’s sources to reverse engineer how to drive the camera. That is a clear upside of the fact that the entire OS is open source and written in JavaScript. Unfortunately I never managed to get a response from the camera: when trying to take a picture it would just hang.

The telephony API on the other hand was documented a bit, but the docs did not say which permissions the app had to request to access the telephony objects, so I lost a lot of time with a crashing app before I figured it out by looking at the OS sources again.

All in all the phones and OS still feel a bit “beta” compared to more mature platforms. But it looks a lot better than what I saw six months ago in the pre-release state, so I am quite hopeful that it will become an interesting platform in the near future.

My app in its somewhat broken state is available on GitHub. I imagine the issues with the camera and all are fixable, but I do not have a phone to test it with yet, so I cannot really work on the app anymore. The emulator allows you to develop some types of apps, but it seems the camera and telephony APIs are not supported there yet.

In any case Firefox OS is fun to play with for web developers and I would recommend you give it a shot. Exciting times ahead!

June 6, 2013 by Jordi Boggiano in Development

Generating fixtures with Alice

A common problem in software development is that you need data to work with. This is especially true for data-oriented websites. Working with an empty database leads to all sorts of unexpected problems cropping up once the site goes live and receives real-world data: performance issues, visual bugs due to missing fields, texts that are too long or too short, etc.

The best way to address this is to have fixtures: fake data you use while developing or for automated testing of the website. However, writing these fixtures can be a cumbersome process, which leads to people postponing it and sometimes skipping it outright.

It used to be that Doctrine1 could load fixtures from yaml files, and symfony1 also had facilities for this, but none of that has been made available for Doctrine2 or Symfony2 yet, or at least not in a very usable form.

Instead of writing a plugin for either of those, I set out to write a generic library to easily create objects from yaml files. It can be used with any framework or ORM, and also integrates the great Faker library to generate fake/random data. The library takes a few opinionated turns, so it might not be for everyone. However I am sure it will save you a great deal of time should you decide to work with it.

The library is called Alice, after the common Alice and Bob placeholder names, and you can of course find it on GitHub, including extensive documentation.
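
To give you a taste, a fixture file is plain yaml mapping class names to named object definitions, with Faker formatters between angle brackets. A rough sketch (the User class is hypothetical, and the exact formatter syntax is best checked against the documentation):

Nelmio\Entity\User:
    user{1..10}:
        username: <username()>
        email: <email()>
        fullname: <firstName()> <lastName()>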

Let me know what you think of it!

October 29, 2012 by Jordi Boggiano in Development, News

A Python solution to a secure backup of CouchDB via replication

This guest post was written by Reto. He works at Bluevalor, Nelmio’s very first client ever. Reto is an expert at analyzing economic data using MATLAB and Python, but he cannot help himself and sometimes enjoys working on our infrastructure as well.

When we started our project with Nelmio, Pierre proposed using CouchDB as a container for the high-dimensional data items we receive from our third parties. We have been happy with CouchDB so far. One of its nice features is its reliance on sequence IDs, which makes synchronisation between different CouchDB instances very easy. It is even possible to use these sequence IDs to set up synchronisation between, say, SQL and CouchDB, since there is a nice API to query the CouchDB server for changes.

A very convenient way to set up a backup of the data is to configure a second CouchDB on another machine and replicate the data onto it. There is a feature called “continuous replication”, which seems to imply that you would have to set up the replication only once… However, there is quite a big drawback as of CouchDB 1.2: if the server is restarted, the replications will not be re-initiated. Even worse, sometimes replications just break down for no apparent reason.

Update: If you set up the replication via the _replicator database it fixes the restart issue.
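
For reference, such a persistent replication is simply a document you PUT into the _replicator database, along these lines (host names and credentials are placeholders):

{
    "_id": "backup_db1",
    "source": "http://admin:pwd@source-host:5984/db1",
    "target": "http://admin:pwd@target-host:5984/db1",
    "continuous": true
}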

In short: CouchDB’s “continuous replication” is not reliable enough as a backup system.

I’ve written a small Python script that you can run as a cronjob to check whether a replication exists for a list of CouchDBs. As a little bonus, I added email notification in case something is wrong, so you can sleep well knowing your CouchDB backup is still working. With this script, it should be viable to back up your CouchDB databases via replication. I’ve attached the code after my little fanboy praise of Python.
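
For example, a crontab entry along these lines runs the check every ten minutes and appends the script’s console output to a log file (the paths are placeholders):

*/10 * * * * python /path/to/check_replicator.py >> /var/log/check_replicator.log 2>&1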

I’ve studied finance and basically taught myself programming for scientific purposes. I’m trying really hard to write good code, but I sometimes lack experience because I do not have a true programming background. If you’d like to point out things I could do better in terms of form, structure or function, please comment!

I’ve worked extensively with MATLAB so far. However, I recently stumbled upon Python as a language for scientific computing and I’m absolutely loving it, so I would like to take the opportunity to praise Python a little:

There are various reasons to use Python for scientific computing:

  1. high level language (good productivity, easy to learn for people like me)
  2. general purpose and object oriented (can interface with everything, bigger projects possible)
  3. beautiful, easy to read syntax
  4. ability to interface with low level languages if speed is first priority
  5. very rich libraries that support scientific computing needs
  6. open source (the MATLAB commercial license is 10-20k CHF depending on toolboxes)

With the open source packages numpy, scipy, ipython and pandas, Python pretty much trumps every other scientific toolbox (R, MATLAB, Mathematica) while remaining super easy to use.

Especially pandas (an open source library that was developed at a hedge fund – true story!) improves the handling of time series data ten-fold. I truly believe that if you need to do research with time series, Python with pandas is the future.

So if you ever run into a problem where you need to do a lot of data cleaning and wrangling, look at pandas. There is a very good book called Python for Data Analysis written by Wes McKinney, the main developer of pandas (albeit only released as an “early release”).

Now to the code: note that you will need the couchdb library to make it work, so either install couchdb-python into your Python environment, or simply put it into the folder of the script. Note that I’ve only tested it with Python 2.7. Configure the “CONFIG” part of the script and you should be all set.

import couchdb
import datetime
import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEText import MIMEText

#-----CONFIG------------------------------------------------------
#source & target addresses
SOURCE = 'http://admin:pwd@host:5984'
TARGET = 'http://admin:pwd@host:5984'
#list of dbs (must have equal length)
SOURCE_DBS = ['db1', 'db2']
TARGET_DBS = SOURCE_DBS
#email credentials
GMAIL_USER = "gmail_user"
GMAIL_PWD = "gmail_pwd"
TO = "your email"
# set to False if no email desired
SEND_MAIL = True
#-----------------------------------------------------------------


class CheckReplicator(object):
    """
    Checks if a replication for a list of dbs on the target exists between two
    CouchDB instances.

    Input a connection string for the source and the target of the
    replication and provide two lists with the names of the databases you want
    to have replicated. (source_dbs[0] will be replicated to target_dbs[0] etc)

    Note that this class prints to console, so if you want to log progress,
    print output to file in cronjob.

    Note that all the dbs need to be created. It is smart to initiate the
    first continuous sync via futon or http api!
    """

    def __init__(self, source, target, source_dbs, target_dbs):
        db_equality = len(source_dbs) == len(target_dbs)
        assert db_equality, "source length must equal target length"

        self.source = couchdb.client.Server(source)
        self.target = couchdb.client.Server(target)
        self.source_string = source
        self.target_string = target

        self.desired_reps = zip(source_dbs, target_dbs)
        self._check_connections()
        self.active_reps = self._get_active_reps_on_target()

    def check(self):

        if self._check_if_all_desired_reps_exist():
            print str(datetime.datetime.now())[:-7] + " ok"
        else:
            self._fix_replications()

    def _check_if_all_desired_reps_exist(self):
        res = True
        for d in self.desired_reps:
            if d not in self.active_reps:
                res = False
        return res

    def _fix_replications(self):
        for d in self.desired_reps:
            if d not in self.active_reps:
                source_str = self._build_source_string(d[0])
                self.target.replicate(source_str, d[1], continuous=True)

        self.active_reps = self._get_active_reps_on_target()
        if not self._check_if_all_desired_reps_exist():
            raise EmailError("""
                             could not replicate all targets. Please
                             check if the couch instances are running
                             and all the dbs are created!
                             """
, SEND_MAIL)
        else:
            print str(datetime.datetime.now())[:-7] + " replicators created"

    def _build_source_string(self, db):
        string = self.source_string
        if string[-1] == '/':
            string = string + db
        else:
            string = string + '/' + db

        return string

    def _get_active_reps_on_target(self):

        tasks = self.target.tasks()

        #parse source and target of the task string
        #from the replication information
        active_reps = list()
        replications = [t['task'] for t in tasks if t['type'] == 'Replication']
        for r in replications:
            first_split = r.split('/ -> ')
            target = first_split[-1]
            second_split = first_split[-2].split('/')
            source = second_split[-1]
            active_reps.append((source, target))
        return active_reps

    def _check_connections(self):
        try:
            self.source.version()
        except Exception:
            raise EmailError('could not connect to source', SEND_MAIL)

        try:
            self.target.version()
        except Exception:
            raise EmailError('could not connect to target', SEND_MAIL)


class EmailError(Exception):

    def __init__(self, value, send_mail=False):
        self.value = value
        if send_mail:
            self._mail('Watchman Error', value)

    def __str__(self):
        return repr(self.value)

    def _mail(self, subject, text):
        msg = MIMEMultipart()

        msg['From'] = GMAIL_USER
        msg['To'] = TO
        msg['Subject'] = subject

        msg.attach(MIMEText(text))

        mailServer = smtplib.SMTP("smtp.gmail.com", 587)
        mailServer.ehlo()
        mailServer.starttls()
        mailServer.ehlo()
        mailServer.login(GMAIL_USER, GMAIL_PWD)
        mailServer.sendmail(GMAIL_USER, TO, msg.as_string())
        # Should be mailServer.quit(), but that crashes...
        mailServer.close()

#run it!
if __name__ == '__main__':
    try:
        check_replicator = CheckReplicator(SOURCE, TARGET, SOURCE_DBS, TARGET_DBS)
        check_replicator.check()
    except EmailError:
        raise  # already reported by email in the code above
    except Exception:
        raise EmailError('program code failed', SEND_MAIL)

October 1, 2012 by Nelmio in Development

A Tour of Go

Last week we had to write a small specialized HTTP client that connects to a server and issues GET requests to fetch data. Simple enough.

The only issue is that the data provider required us to use at most one connection at any time, using HTTP pipelining to issue all the GET requests over the same connection and then block until they respond with new data. In practice this is not very complex, but PHP (our go-to hammer) does not really shine in this area.

All the HTTP abstractions around curl/http streams like Buzz or Guzzle do not seem to support pipelining (if I missed that, let me know), and doing it by hand is not particularly fun. The http extension does seem to support it through some undocumented feature, but I did not have that extension at hand, so I skipped it. In any case, PHP is not really the best tool for this sort of long-running script, although it has gotten much better in recent years. So we looked for something else, and long story short, we decided to try Go. It is a new-ish language that aims to replace C/C++/Java for systems programming, mixing and matching features from everywhere while focusing on language terseness.

So what about Go?

Well, it is interesting. I just opened an editor and within a few hours I had a first prototype running; a few more hours of polishing and tweaking later I had a finished product that has now been running for a week. It handled a few tens of thousands of entries and, notably, did not crash once ;) The standard library is quite complete for what I wanted to do here, and there is already quite a large ecosystem of extra libs available for more specific tasks.

The documentation is not awful, but Go is a young language and the docs could benefit from more examples and cross-references. On the other hand, I started coding without reading any of the basics, so I had to learn a few things the hard way. One big plus though is that the documentation links to the sources, so you can very easily check out what something does behind the scenes.

Static typing is a big change coming from PHP, which abstracts most of this away. However, thanks to Go’s light approach to typing, you do not feel it too much. Using := for assignment, you can define a variable and have its type inferred automatically from the value you assign to it. That gives you typing without the verbosity. Then of course you get compile-time errors based on all this, which is great for debugging. Overall this saved me from making a bunch of mistakes, and did not cause much pain.

The recover function is also quite a nice fit for the use case at hand. It allows you to catch a panic and keep running some code. Using this I made sure the program never dies: if anything bad happens, it errors out and instantly restarts doing its job. It reacts much faster than monit or similar watchdog programs typically would, since it is triggered by the application failure itself.

Finally I want to quickly touch on goroutines. They let you do parallel processing very, very easily. All you have to do is prefix a function call with the go keyword, and it turns into a background task of sorts. You can then use channels to communicate and synchronize between multiple goroutines. It is a great model and should allow people to write parallelized programs without the pain usually associated with them.

To sum up, Go is quite fun. I would not use it to build a large web app though; I do not think the tools are there yet, and we have a great ecosystem in PHP for that. But for simple utils I would definitely recommend it to anyone who feels like trying something new.

June 22, 2012 by Jordi Boggiano in Development

An Appeal to All Package Managers

I work on Composer – a new PHP dependency manager – and it is now working quite well for managing PHP packages. We have a decent (and fast-growing) number of packages on Packagist. All is well. Yet most PHP projects are websites, and they need some frontend libraries, be it JavaScript or CSS – I will use jQuery as an example that everyone can grasp easily. In some frameworks, you have plugins that bundle a copy of jQuery. Some people have also used Composer to hack together jQuery packages so that they can download it, and then they have some scripts to copy or symlink the files into the proper location. That is all very flaky: you end up with multiple copies of jQuery, and if you are lucky you even get various different versions.

I have been thinking about it for a few months, and it seems like nothing exists out there. Every language has its own package manager, but everyone seems to be stuck with the same problem of frontend code. Obviously jQuery is used by virtually everyone. They can’t support Composer, Bundler, npm, pip and god knows what. Building a package manager for JS/CSS could work, but the community is huge and scattered, and getting broad adoption is going to be very difficult.

The plan

As far as I can see, and given the way Composer works, it would be fairly easy to build a central package repository – similar to Packagist – where people can submit package definitions for frontend libs. For instance, I could go and add a Twitter Bootstrap package, and define the URL where you can download it, including some placeholder for the version. Then all we need is a maintained list of versions. We could start by doing that by hand, or we can just point it at the git repo so it reads tags. That is how Packagist works – except that it reads the composer.json file in the git repo to get all the metadata, which in this case would be entered manually.

If we do this, we end up with a central repository of CSS and JS packages, and we can integrate it in Composer, so that Composer packages can depend on jQuery and it just works. That would be a good start, but the great thing would be to get everyone on board. And I don’t mean everyone writing PHP. I mean everyone. The Ruby folks, the Python ones, Java, .NET, you name it. You all have package managers. All we have to do is agree on the API of the central package repository and on what metadata is needed. Then you can just add support for it in your package manager of choice, and we all benefit from the manual work put into listing packages. If it works, I’m sure some of the frontend packages will then add the metadata directly in their git/svn/.. repos so that we save on manual work. This would be a huge thing for everyone.

There are of course a few more details to settle regarding security and trust, as well as the exact package metadata, but I wanted to gauge the interest first and discuss further afterwards. I opened a frontend-packaging Google group for that purpose, so if you are interested please join in. All it takes is a few open-minded people, and we could start one of the largest cross-language collaboration projects ever. Sounds like fun!

May 31, 2012 by Jordi Boggiano in Development

An Update On Composer

This weekend we have been busy hacking on Composer in our office together with Nils Adermann and Volker Dusch. We wanted to push the project forward a bit faster than the odd free evenings usually allow, and I would now like to introduce the changes we made.

Development versions handling

The master-dev and similar *-dev versions we used to have were causing quite a few issues, so we decided to overhaul that behavior in a way that gives us more consistency and fixes a few long-standing issues. For example, dev versions can now be locked to exact commit revisions, and they will update to the latest revision when you run an update; no need to delete them from disk beforehand.

Basically dev releases are now simply branch names with a dev suffix – for numeric branches which are comparable – or a dev prefix for textual names that are not comparable, like feature branches and master. There is no way to specify the version manually anymore in your repository’s composer.json, since that was causing potentially dangerous issues with feature branches conflicting with the original ones.

If your package depended on a master-dev version, you should now depend on dev-master. If your package depended on something like the Symfony2 2.1.0-dev version, this also is now dev-master since it lives in the master branch. Older version branches like 2.0-dev, which is the 2.0 branch and not master, are unaffected by this change.
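
In composer.json terms, the rename just means updating the constraint, for example:

{
    "require": {
        "symfony/symfony": "dev-master"
    }
}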

This change will break many packages out there that rely on -dev packages of any kind, and we hope everyone will update their composer.json files as swiftly as possible to make the transition less painful.

The Packagist version database had to be reset for this change, so things will look a bit empty for a couple of hours while everything is re-crawled. None of the packages are lost, and you should not have to do anything except have a bit of patience.

Dependency solver stability

Nils and Volker have been making great progress on bugfixing and testing the solver. Those are mostly highly technical details that I will not dive into here, but long story short, many old bugs should be fixed, and then some. This may obviously have introduced regressions, so if you encounter any issues please report them along with your composer.json file so we can easily reproduce them.

Documentation

Igor has spent quite a bit of time on documentation, which you can see on GitHub for now, and which should be migrated to getcomposer.org soon.

Packagist / GitHub integration

Another great new feature, coming from a pull request by Beau Simensen, is the ability to let GitHub tell Packagist when you push new code to your repository. This should make package updates almost instant. It should be integrated into the GitHub Service Hooks soon enough, so if you don’t want to set it up by hand you can wait a bit; otherwise you can grab your API hook URL on your Packagist profile page and add it to your repository.

Repositories configuration

It seemed that the way custom repositories are configured was confusing, so we took the chance to make it a bit clearer. Basically, names are dropped and everything is stored in a flatter structure that is easier to remember. The documentation has been updated on Packagist.

All in all it has been quite a productive weekend, and we will continue working on a few things today.

February 20, 2012 by Jordi Boggiano in Development, News

Composer: Part 2 – Impact

In the first part of this post I introduced Composer & Packagist. If you are not familiar with them please read part 1 first.

Impact

In this second part I would like to talk about a few things Composer could do for you, and the PHP community at large, once it is broadly adopted.

Common APIs and Shared Interfaces

You may have noticed that quite a lot of people are talking about and asking for more interoperability and cooperation between frameworks. It seems some PHP developers finally got tired of reinventing the wheel. That is great news. One way to provide this interoperability is through shared interfaces. The two main candidates there, in my opinion, are logging and caching: two boring things that should just work, and where you always need tons of flexibility and tons of different backends, drivers, or whatever you want to call those. Almost every major framework and CMS out there has its own implementation of that stuff, yet none of them support all the options since there are too many.

The PHP Standards Group, an open mailing list discussing these interoperability questions, has seen a recent proposal for a Cache Interface. One question raised was: how can those interfaces be distributed to each project that uses or implements them?

This is where I see Composer helping. Composer supports advanced relationships between packages, so to solve this issue you would need three parts (read carefully):

  • The psr/cache-interface package contains the interfaces, and requires a psr/cache virtual package.
  • Implementors of the interfaces (many libraries) all require the psr/cache-interface and also provide the psr/cache virtual package.
  • A framework that needs a cache library requires psr/cache-interface and hints the interface in its method signatures.

Then the user of that framework comes in and decides that he wants to use the Doctrine\Common cache implementation, for example. By requiring doctrine/common, the psr/cache requirement of the psr/cache-interface package would be satisfied. Both doctrine and the framework would use the interfaces from the psr/cache-interface package. No code duplication all over the place, and everyone is happier. All those require and provide relationships carry version constraints, so the interfaces can easily be versioned, and Composer will not let you install things that do not work together.
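
To make this concrete, here is a sketch of what an implementing library’s composer.json could look like (acme/cache and the psr/* names are the hypothetical packages from the list above):

{
    "name": "acme/cache",
    "require": {
        "psr/cache-interface": "1.*"
    },
    "provide": {
        "psr/cache": "1.0.0"
    }
}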

Plugin Installs for Frameworks and Applications

Composer is built to be embedded in other frameworks, CMSs or other applications. Some parts are still a bit rough for that use case, but it is something that will be supported and encouraged. Reinventing the package management wheel is another thing that really should stop. Who am I to say this, you ask? It is true, we are building a shiny new wheel as well. Yet I take comfort in the fact that we are trying to build a generic solution which will work for everybody.

Packages are easy to build – for those who insist on not reading the first part of this post: you drop a simple composer.json file in your repository and add the VCS repository to packagist.org. The goal is that building packages should be accessible. I would love it if TYPO3, Drupal or WordPress, to name a few, would use Composer as a library internally to handle their dependencies. The list of required packages does not have to live in a composer.json file, it can sit in a database just fine. That would mean that suddenly the WordPress plugin you are developing could depend on an external library to do some work, and you would not have to embed the whole library code in your plugin’s repository. Autoloading would make it work magically, as long as everyone respects PSR-0. Which brings me to my next point.

Promoting Standards

A few months back I was on IRC and someone linked his new library – who or what it was does not matter. I just noticed he used a home-made autoloader and asked him why he was not following the PSR-0 standard. The answer was “I just use a smarter autoloader, with fallback feature”. Now that is great, and maybe his solution is smarter in that it allows files and classes to be anywhere. But it messes with everybody else. No one can use that library unless they declare another autoloader just for it. Autoloading should really be a commodity that you do not have to lose time fixing.

By adopting and promoting the standard, I hope Composer will help raise awareness about it. If you follow PSR-0, Composer autoloads your packages. If you don’t, you are on your own. The more users start to rely on this, the more they will get annoyed when a package requires manual configuration to be autoloaded, which will put some pressure on the PSR-0 offenders.

Promoting Code Re-use

It is probably obvious, but having easy-to-use package management means you will use it more, and the more it is used, the more people will re-use and share code. I really hope to see many libraries pop up out there instead of the massive frameworks we had until recently.

This shift is already happening, the larger frameworks like Symfony2 and Zend Framework 2 have decoupled their internal components and it is now possible to use pieces of them individually. They start to look more like the PEAR repository, which is an aggregate of libraries that work well together, some depending on each other, but not all.

Single libraries out there are great but I see some value in these larger organizations enforcing some quality guidelines on their own code-base. In a way they act like brands. You know that if you use one of their packages you can expect a certain quality.

Renewed Interest in PHP

Overall, I believe that libraries like Buzz, Imagine and others can create a sort of DSL on top of the (sometimes really bad) PHP APIs. Many people have criticized PHP as a language for its inconsistencies and awkward corners. Fine. I am not going to argue with that. But I hope many of those people, if they are honest, will agree that PHP as a platform is great. It runs everywhere, it does not require much configuration, and it has an immense developer base.

If we have enough libraries that abstract away some of the language issues, I strongly believe PHP as a platform will have a bright future.

December 20, 2011 by Jordi Boggiano in Development

Composer: Part 1 – What & Why

You may have heard about Composer and Packagist lately. In short, Composer is a new package manager for PHP libraries. Quite a few people have been complaining about the lack of information, or just seemed confused as to what it was, or why the hell we would do such a thing. This is my attempt at clarifying things.

The second part of this post, Impact, has now been published.

What is it?

The Composer ecosystem is made of two main parts, both available on GitHub. The development effort is led by Nils Adermann and myself (Jordi Boggiano), but we already have more than 20 contributors, whom I would like to thank a bunch for helping.

Composer

Composer is the command-line utility with which you install packages. Many features and concepts are inspired by npm and Bundler, so you may recognize things here and there if you are familiar with those tools. It contains a dependency solver to be able to recursively resolve inter-package dependencies, a set of downloaders, installers and other fancy things.

Ultimately as a user, all you have to do is drop a composer.json file in your project and run composer.phar install. This composer.json file defines your project dependencies, and optionally configures composer (more on that later). Here is a minimal example to require one library:

{
    "require": {
        "monolog/monolog": "1.0.0"
    }
}

Looking at the package publisher side, you can see that there is some more metadata you can add to your package, mainly to allow Packagist to show more useful information. One great thing though: if your library follows the PSR-0 standard for class and file naming, you can declare it here (see the last two lines below) and Composer will generate an autoloader for the user that can load all of his project dependencies.

{
    "name": "monolog/monolog",
    "description": "Logging for PHP 5.3",
    "keywords": ["log","logging"],
    "homepage": "http://github.com/Seldaek/monolog",
    "type": "library",
    "license": "MIT",
    "authors": [
        {
            "name": "Jordi Boggiano",
            "email": "j.boggiano@seld.be",
            "homepage": "http://seld.be"
        }
    ],
    "require": {
        "php": ">=5.3.0"
    },
    "autoload": {
        "psr-0": {"Monolog": "src/"}
    }
}

Composer is distributed as a phar file. While that usually works out of the box, if you cannot even get it to print its help with php composer.phar, you can refer to the Silex docs on pitfalls of phar files for steps you can take to make sure your PHP is configured properly.

Packagist

Packagist is the default package repository. You can submit your packages to it, and it will build new packages automatically whenever you create a tag or update a branch in your VCS repository. At the moment this is the only supported way to publish packages, but eventually we will allow you to upload package archives directly, if you fancy boring manual labor. You may have noticed that the composer.json above contains no version; that is because Packagist takes care of it: it creates (and updates) a master-dev version for my GitHub repo’s master branch, and then creates new versions whenever I tag.

You can run your own copy of Packagist if you like, but it is built for a large number of packages, so we will soon release a smaller tool to generate repositories that should be easier to set up at small scale.

If you have no interest in using Packagist with Composer, or want to add additional repositories, it is of course possible.

Why(s)?

Why do I need a package manager?

There is a huge trend of reinventing the wheel over and over in the PHP world. The lack of a package manager means that every library author has an incentive not to use any other library, since otherwise users end up in dependency hell when they want to install it. A package manager solves that: users do not have to care anymore about what your library depends on, all they need to know is that they want to use your stuff. Please think about it real hard and let it sink in; I will get back to that in the next post.

So we started working on Composer because there is no satisfactory solution at the moment for PHP, and that is quite unacceptable in this day and age. Of course with such a bold statement, you may be wondering:

Why not use PEAR?

While PEAR was and remains a viable option to some people, many have also been dissatisfied with it for various reasons. Composer has a very different philosophy, and it probably will not please everybody either. The main aspect that differs is that PEAR started as a system-wide package manager, much like apt-get or other similar solutions.

That approach does not work very well when you have many projects running on one machine, some of them 5 years old and depending on outdated versions of a library a newer project also uses. You can’t easily install both versions at the same time, and lots of frustration ensues.

Another issue is that one project’s dependencies become very fuzzy: since you code against code that is installed somewhere on your system, you can easily forget to mention in your README that your app depends on library X. Future-you or another developer comes along, tries to set up the project, and is left in a run -> see error -> install lib -> run loop until all errors are gone. If he is really out of luck, he misses one dependency that is rarely used, and something fails unnoticed later on.

Composer on the other hand forces you to declare your project dependencies in a one-stop location (composer.json at the root). You just checkout the code, install dependencies, and they will sit in the project directory, not disturbing anything else on the machine. Another related feature is the composer.lock file that is generated when you install or update dependencies. It stores the exact version of every dependency that was used. If you commit it, anyone checking out the project will be able to install exactly the same versions as you did when you last updated that file, avoiding issues because of minor incompatibilities or regressions in different versions of a dependency. If you ever had bugs appear only on one team member’s machine while the others were fine because of some too-new or too-old version of something, you will know this is very useful.

Another notable difference, although some may not care about this, is that there is no approval process to have your package included on Packagist. While our vendor-name/package-name convention resembles PEAR’s channel/package, we do not have channels. All repositories contain packages that go into one big package pool, and then the solver figures out which packages fit your requirements, no matter where they come from.

Why JSON for packages?

It is a recurring question, so I will answer it, hopefully for the last time. The short answer is: because. The longer one is that there are many options (yaml, json, xml, php, ini, whatever), each has its fan-base, and each has its haters. Whatever we had picked, someone would be complaining. If you think it is a stupid decision, I am sorry to announce you are in the group selected to be the folks complaining, but it is not going to change. Please try not to focus on such a detail, and look at the bigger picture.

Where are we now?

I have delayed writing this post for quite a long time. I wanted it to be all nice and shiny before announcing anything. Unfortunately it is not as polished as I would like yet, but we are getting there. Composer can install itself (well, its dependencies) via Packagist, and many people have played with it and are looking to integrate it into their work environments.

Here are the main points that still need some love, and of course if you would like to help you can join us on IRC (freenode #composer-dev) or on the mailing list.

Documentation

Documentation – or the lack thereof – is a huge problem right now. This post is a first step in the right direction, and we will definitely work on more formal documentation in the future. You could help too.

Solver bugs

The dependency solver is a complex beast; it has been ported from C code and it still has some rough edges. Not much to say here, we just need people to try it and report bugs. Bonus points if you can write a unit test that reproduces the issue.

Private repositories

This is a big topic that we need to address as well. Installing closed-source packages is of course necessary in most companies, and we will definitely work on it once the basics are working well and the open-source use case is covered. If you have some time or money to invest in that and want it to happen ASAP, please get in touch with us.

Global installs

As I said, we work with local installs by default, and that will not change for everything that is directly project-related. That being said, there is a whole set of CLI-tools for testing/QA or other purposes that would benefit from being installed system-wide and executable from anywhere. It is already possible to do a local install of those in your home dir and then add the bin directory of that install to your PATH of course, but we would like to support a more streamlined experience.

Part two will come next week, covering a few use cases, visions and hopes we have for Composer and PHP as a whole. To stay up to date you can follow @packagist or myself on twitter.

Update: You can now read Part 2.

December 8, 2011 by Jordi Boggiano in Development

CORS with Sencha Touch

A while back I was working on the mobile version of techup.ch, which you can find at m.techup.ch. The web application is built with Sencha Touch. I did not want to deploy to m.techup.ch yet, which is how I ran into issues with the same origin policy.

In order to prevent web applications from gaining access to things they should not have access to, browsers enforce a same origin policy, which means that AJAX requests can only be made to the same host and port that the website is hosted on. This limits cross-site attacks, as an attacker’s page cannot read the responses of requests it makes to remote websites through your browser.

Our Sencha Touch app however relies on a JSON API that is served from techup.ch. Since the app itself is not hosted on that domain (yet), it cannot access the API.

A workaround: JSONP

While AJAX is restricted by the same origin policy, script tags are not. It is possible to include script tags pointing to a remote site, which is often used to embed widgets from sites like Facebook or Twitter. It can also be abused to fetch data from a remote domain.

This works by adding a script-tag proxy pointing to the API, including a callback parameter in the query string, for example:

http://techup.ch/api/events/upcoming.json?callback=processData

The API will detect the callback parameter and pad the response by wrapping the data it returns in a call to that function.

Conventional Response

{
    "events": [
        {
            "id": 361,
            "name": "Linalis LPI 201"
        }
    ]
}

JSONP Response

processData({
    "events": [
        {
            "id": 361,
            "name": "Linalis LPI 201"
        }
    ]
});

And that is what the “P” in JSONP stands for: JSON with padding.

Of course the processData function has to exist. jQuery’s AJAX function has built-in support for JSONP: if you have callback=? in the requested URL, it will automatically use JSONP for the request. Sencha Touch also supports this through Ext.data.ScriptTagProxy.

But this is more a hack than a real solution. Enter CORS.

Cross-Origin Resource Sharing

CORS is a W3C Working Draft which describes an extension to HTTP allowing browsers to issue AJAX requests across domains. The server handling the request can send additional headers notifying the client of these extended permissions.

The only header that needs to be set is:

Access-Control-Allow-Origin: *

This will allow AJAX requests from any domain. Instead of the wildcard, you can also specify the allowed domains explicitly. Only set the wildcard if it is safe to do so: any site out there can then have its visitors’ browsers request those resources and read the responses.

Now, Sencha Touch (as well as many other AJAX libraries) will also send an X-Requested-With header, allowing the server to detect that the request was sent through JavaScript. CORS requires the server to specify which headers may be sent by the client. That is done as follows:

Access-Control-Allow-Headers: x-requested-with

This is a comma-delimited list, so you can add more headers if needed.

If you want to allow HTTP methods other than GET, you will have to list them explicitly with yet another header:

Access-Control-Allow-Methods: GET, POST
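
On the PHP side this boils down to sending these few headers before any output. A minimal sketch (the allowed origin is just an example):

<?php

// allow the mobile app's origin to call this API
header('Access-Control-Allow-Origin: http://m.techup.ch');
header('Access-Control-Allow-Headers: x-requested-with');
header('Access-Control-Allow-Methods: GET, POST');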

Authentication

By default the browser will not send any cookies with CORS requests. You can however set the Access-Control-Allow-Credentials header to true, which will allow cookies to be sent. In this case you cannot use the wildcard origin; you have to define the allowed origin domains explicitly.

Alternatively you can handle authentication yourself. In this case the user provides a username and password to the app, which then makes a request to the API. The API responds with an authentication token, which can be used in subsequent authenticated requests. This way you do not have to rely on cookies. You can store the token in localStorage, so you do not lose it on page reloads.

Browser support

A recent article describes the browser support as follows:

  • Webkit browsers: good
  • Gecko browsers: good
  • Trident browsers (Internet Explorer 8+): good with some gotchas
  • Opera: very sucky, a.k.a. non-existent

Since we are targeting mobile here, which in this case is pretty much WebKit only, we can just use it.

Conclusion

CORS is a nice solution for remote API access from mobile web applications.

Check out enable-cors.org.

November 3, 2011 by Igor Wiedler in Development