Hadoop
Overview
During the 12.04 cycle, Hadoop and a selection of related components will be packaged to support Hadoop charming and will be available in PPAs:
- hadoop 1.0.1
- hive 0.8.1
- pig 0.9.2
- hbase 0.92.0 (maybe 0.92.1)
- zookeeper 3.4.3
These components are secondary and may be packaged:
- hcatalog 0.2.0
PPAs
A team (http://launchpad.net/~hadoop-ubuntu) and three PPAs have been set up on Launchpad. All three PPAs have been enabled for armel, armhf and powerpc, so please check before adding anyone else to the team.
dev
ppa:hadoop-ubuntu/dev
Development versions of packages: new versions land here before they move to the testing PPA for full testing.
testing
ppa:hadoop-ubuntu/testing
Versions of packages which are ready for testing.
stable
ppa:hadoop-ubuntu/stable
Stable versions of packages which have been tested.
Packaging Details
Initial Packaging
Packages should be based on the source packages built by bigtop (http://incubator.apache.org/bigtop, http://github.com/apache/bigtop). However, most of these packages target an older debhelper, so d/rules etc. should be rationalised to use newer debhelper features:
git clone https://github.com/apache/bigtop
mkdir -p hadoop/debian
cp bigtop/bigtop-packages/src/deb/hadoop/* hadoop/debian
cp bigtop/bigtop-packages/src/common/hadoop/* hadoop/debian
Packages should ship upstart configurations; see hadoop for examples.
Upstream Source
Packages should be based on upstream binary distributions. Java components should not be rebuilt, but native components will need to be rebuilt with appropriate patches for precise and the official ports.
Packages should define a target in debian/rules called get-orig-source which pulls the correct version of the upstream distribution from an appropriate upstream source. Ideally checksums should be validated.
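The checksum validation suggested above could be sketched as a small shell helper that a get-orig-source target might call once the tarball has been downloaded. This is only an illustration: the function name verify_orig_tarball, the choice of SHA-256, and the throwaway demo file are assumptions and are not part of the existing packaging.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Usage: verify_orig_tarball <file> <expected-sha256>
verify_orig_tarball() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    if [ "$actual" != "$expected" ]; then
        echo "checksum mismatch for $file" >&2
        return 1
    fi
    echo "checksum OK for $file"
}

# Self-contained demonstration: a throwaway file stands in for the
# downloaded orig tarball, and its own digest stands in for the
# published upstream checksum.
demo=$(mktemp)
printf 'example upstream tarball contents\n' > "$demo"
good=$(sha256sum "$demo" | cut -d' ' -f1)
verify_orig_tarball "$demo" "$good"
rm -f "$demo"
```

In a real target the expected digest would be recorded in the packaging branch, so that a changed or corrupted upstream download makes get-orig-source fail rather than silently produce a different orig tarball.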
Version Control
Packaging-only branches (containing just the debian folder) should be created for all packages and stored under the hadoop-ubuntu team, e.g. lp:~hadoop-ubuntu/ubuntu/precise/hadoop/trunk
Packages should use source format 3.0 (quilt), set in debian/source/format, and should have packaging version numbers of the form 0.20.205.0-0ubuntu1~hadoopX. The ~hadoopX suffix should be used to support multiple upload iterations to the PPAs.
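The versioning scheme above can be illustrated with a short shell sketch. In Debian version ordering a tilde sorts before everything, including the end of the string, so a ~hadoopX version is always lower than the same version without the suffix, while X can be incremented for each PPA upload. The helper name ppa_version and its argument order are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: build a PPA package version of the form
# <upstream>-0ubuntu<rev>~hadoop<iteration>.
ppa_version() {
    upstream="$1"   # upstream release, e.g. 0.20.205.0
    rev="$2"        # Ubuntu packaging revision
    iteration="$3"  # PPA upload iteration (the X in ~hadoopX)
    printf '%s-0ubuntu%s~hadoop%s\n' "$upstream" "$rev" "$iteration"
}

# Two successive PPA uploads of the same packaging revision.
ppa_version 0.20.205.0 1 1
ppa_version 0.20.205.0 1 2

# If dpkg is available, confirm that the "~" suffix sorts below the
# unsuffixed version.
if command -v dpkg >/dev/null 2>&1; then
    dpkg --compare-versions "$(ppa_version 0.20.205.0 1 2)" lt "0.20.205.0-0ubuntu1" \
        && echo "~hadoop2 sorts before the unsuffixed version"
fi
```

Because the suffixed versions sort lower, a later upload of the plain 0ubuntu1 version would supersede any PPA build of that revision.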
Building Packages Locally
Packages should be buildable using the following procedure:
bzr branch lp:~hadoop-ubuntu/ubuntu/precise/hadoop/trunk hadoop
cd hadoop
./debian/rules get-orig-source
bzr bd -S
cd ..
sbuild -A -d precise hadoop*.dsc
Uploading Packages to PPAs
Once an initial orig.tar.gz has been uploaded to a PPA, subsequent packaging updates can be uploaded using:
cd hadoop
bzr bd -S -- -sd
cd ..
dput ppa:hadoop-ubuntu/dev hadoop*_source.changes
Finding JAVA_HOME
Hadoop and friends need reliable JAVA_HOME detection and setup; this is provided by the bigtop-utils package (lp:~hadoop-ubuntu/+junk/bigtop-utils).
. /usr/lib/bigtop-utils/bigtop-detect-javahome
Most of the bigtop packages use this in the provided init scripts. See hadoop for examples.
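As a rough sketch of how an init script might use the detection script defensively, the wrapper below sources it and stops early if JAVA_HOME is still unset. The function name load_java_home, the optional path argument, and the stand-in script used for the demonstration are assumptions. The sketch also assumes that the real bigtop-detect-javahome sets JAVA_HOME when sourced.

```shell
#!/bin/sh
# Hypothetical wrapper: source a JAVA_HOME detection script and fail
# early if JAVA_HOME is still unset afterwards.
load_java_home() {
    detect="${1:-/usr/lib/bigtop-utils/bigtop-detect-javahome}"
    if [ ! -r "$detect" ]; then
        echo "detection script not found: $detect" >&2
        return 1
    fi
    . "$detect"
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME could not be determined" >&2
        return 1
    fi
    export JAVA_HOME
}

# Self-contained demonstration: a stand-in script takes the place of the
# real detection script, so this runs without bigtop-utils installed.
stub=$(mktemp)
echo 'JAVA_HOME=/usr/lib/jvm/example-jdk' > "$stub"
load_java_home "$stub" && echo "JAVA_HOME=$JAVA_HOME"
rm -f "$stub"
```

Called with no argument, the wrapper falls back to the bigtop-utils path given above.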
ServerTeam/Hadoop (last edited 2012-03-09 17:45:28 by james-page)