Purpose
=======

The goal of Bagheera is to provide a complete software package management
system that satisfies the following fundamental goals:

 * easy package format that closely resembles command-line installation
 * transparent and mixed usage of source and binary packages
 * automatic dependency handling for all package operations
 * easy maintenance of a system for installation, update and removal
 * user-friendly, robust, stable, efficient, embeddable and self-contained
 * powerful query language
 * multiple prioritized parent repositories which can be distributed
 * safe installation through sandbox technology
 * secure, even when building from source
 * use flags for the configuration of personal feature preferences
 * easy package creation through developer tools

Architecture
============

[Architecture diagram. Components, from top to bottom: Remote repositories
1 to 3; network boundary; Synced repositories 1 to 3; Prioritized repository
merger; Locally merged repository; parsing and conversion of packages
(ANTLR, Package parser); storage of packages and structures (McKoi RDBMS);
Bash, Sandbox and Baggy execution environment; BQL Engine, BQL Parser and
Direct SQL access; Java management API with C++ API, C API, Python API,
Ruby API and Repository convertor; Console and GUI package management
tools; Repository server; network boundary; Synced repository.]

Global settings
===============

/etc/bagheera/settings-custom.conf
/etc/bagheera/settings-default.conf

These files contain a list of properties that allow users to easily
configure Bagheera to their liking. The properties in the custom file
override those in the default file. The latter is usually provided by the
distribution for sensible default behaviour.

The following property keys are supported:

use
    Configuration of use flags that will be used for all the packages.
licenses
    List of license identifiers that have been accepted (this can
    automatically be modified during installation).
repositories
    The list of repositories from which the local repository will be
    composed.
platform
    The platform on which the packages will be installed (X86, PPC, ...)
chost
    The target compilation host that will be used.
cflags
    The compilation flags that will be used to compile C sources.
cxxflags
    The compilation flags that will be used to compile C++ sources.
fflags
    The compilation flags that will be used to compile Fortran sources.
jflags
    The compilation flags that will be used to compile Java sources.
makeopts
    The options that will be provided to the make command.
strip
    Indicates whether the binaries and libraries should be stripped.
root
    The root of the package installation.
type
    Default package type that will be installed: either 'source' or
    'binary'.
logfile
    The logfile to which all compilation and installation output will be
    redirected.
tmpdir
    The temporary directory that will be used.
repositoriesdir
    The directory to which the remote repositories will be synchronized
    and in which the local repository will be built. Each repository gets
    a dedicated sub-directory.
archivesdir
    The directory in which the downloaded source and binary archives will
    be stored.
configsdir
    The directory in which package configuration tasks will be written.
messagesdir
    The directory in which package message files will be written.
rollback
    Enable or disable the rollback functionality.
rollbackdir
    The directory in which the rollback data will be written.

/etc/bagheera/mirrors.conf

This optional file makes it possible for the user to override any mirror
definitions that have been made in the repositories. The format is exactly
the same as the setup/mirrors.conf file in repositories and offers a
convenient way to download from geographically closer servers.

Runtime parameters
==================

--use
    Allows temporary modification of use flags for the current execution.
    For example:

        baginst --use="-gnome ipv6"

--licenses
    Allows temporary modification of accepted licenses for the current
    execution. For example:

        baginst --licenses="GPL LGPL -MPL"

The following options simply override the system-wide settings with the
same names for the current execution.
--debug
    Indicates whether debug information should be printed.
--makeopts
--chost
--cflags
--cxxflags
--fflags
--jflags
--strip
--root
--type
--logfile
--tmpdir
--repositoriesdir
--archivesdir
--configsdir
--messagesdir

Repository setup files
======================

The following files are used to set up repositories. They will be
automatically read and combined whenever the local repository is
synchronized. In case of conflicts, the data of repositories with a higher
priority will be used. After the synchronization, Bagheera will
automatically interpret the contents of these setup files and import them
into the internal database. Removed data will only be erased from the
internal database when no installed package is using it anymore.

setup/categories.conf

This file provides a list of all the categories that are known by the
distribution. It is just a list of lowercase words that are separated by a
newline. Optional text after the keyword provides a description of the
category. For example:

    development  Used for the development of software.
    c            Supports the C programming language
    editor       Offers text file editing facilities

setup/mirrors.conf

This file provides a list of all mirror identifiers that can be used in the
list of the downloadable files that a package requires. Each mirror needs
to be defined on a separate line. The parts of a mirror specification are
separated by whitespace. The first token is the identifier and the next
tokens are the list of locations that can be used for this mirror. For
example:

    sourceforge http://osdn.dl.sourceforge.net/sourceforge
    gnome       ftp://ftp.gnome.org/pub/gnome \
                ftp://archive.progeny.com/GNOME \
                ftp://ftp.sunet.se/pub/x11/GNOME
    gnu         ftp://mirrors.kernel.org/gnu http://mirrors.kernel.org/gnu \
                ftp://gatekeeper.dec.com/pub/GNU/ \
                ftp://ftp.keystealth.org/pub/gnu/

setup/licenses/{KEY1,KEY2,KEY3,...}

Each known license has a unique identification key that is the name of a
file in this directory.
These files contain the exact verbatim text of the license itself. If the
first line of the content starts with the identifier of a known license,
the license file is interpreted as being a 'license group' and the list of
license keys (one per line) defines the licenses that are part of this
group.

setup/useflags.conf

This file provides a list of all the use flags that are known by the
distribution. It is just a list of lowercase words (the underscore char is
prohibited for binary package support) that are separated by a newline.
Optional text after the keyword provides a description of the use flag. For
example:

    gnome  Adds GNOME support
    gtk+   Adds support for gtk+ (The GIMP Toolkit)
    ssl    Adds support for Secure Socket Layer connections

setup/virtuals.conf

This file provides a list of all the virtual packages that are known by the
distribution. It is just a list of lowercase words that are separated by a
newline. Optional text after the keyword provides a description of the
virtual. For example:

    virtual-glibc  Provides glibc support.
    virtual-x11    Provides x11 support.
    virtual-jdk    Provides java development kit support.

Baggy format
============

Package name (PACKAGEID)
------------------------

Each package has a name, a version and optionally a revision.

Example: my-nice-foo-1.0_pre1-r1.bag

The name conforms to the following regular expression:

    name ~= [0-9a-z](([0-9]|-?[a-z]+)[\._]?)*

This means that my-nice-package is correct, but My-Nice-Package is not:
lower case is enforced. Also note that the regular expression seems more
complex than it should be, but this is to ensure that the minus sign is
never followed by a number in the package name. Otherwise it's much too
difficult to write a correct parser.

The complete version is divided into four parts:

 * the upstream version (version),
 * the upstream sub-version (subversion),
 * version suffix type (suffixtype), and
 * the suffix version (suffixversion).
Here are the regular expressions that define them:

    version       ~= [0-9]+(\.?[0-9]+)*
    subversion    ~= [0-9]{0,4}
    suffixtype    ~= _(scm|alpha|beta|pre|rc|p)
    suffixversion ~= [0-9]+

Below is the precedence of the suffix types:

    1.0_scm < 1.0_alpha < 1.0_beta < 1.0_pre < 1.0_rc < 1.0_p

The scm suffix type is a bit special; this is explained in detail in the
paragraph about SCM support.

The revision has to conform to the following regular expression:

    revision ~= \-r[0-9]+

A complete package name is thus constructed like this:

    name-fullversion(revision)?

The revision is thus optional.

Available bash variables
------------------------

The following correspond to the settings and execution flags:

    PLATFORM
    CHOST
    CFLAGS
    CXXFLAGS
    FFLAGS
    JFLAGS
    DEBUG
    ROOT

The following correspond to the variables specified in the package:

    DISPLAYNAME
    SUMMARY
    DESCRIPTION
    HOMEPAGE
    CATEGORIES
    LICENSES
    USES
    PROVIDES
    COMPATIDS
    CDEPEND
    RDEPEND
    MAINTAINER
    ORGANIZATION

The following are special variables that are available to provide
additional context information to the installation script:

TYPE
....

The type of package that has to be installed; this is either source or
binary.

    In case of source, ${TYPE} will be : "SRC"
    In case of binary, ${TYPE} will be : ${PLATFORM}

DOWNLOADS
.........

This is an associative array of all DOWNLOAD entries, based on the
installation type and platform. For example, if these entries have been
defined as follows:

    # DOWNLOADS
    #   SRC protocol://url/archivefile1
    #   SRC protocol://url/archivefile2
    #   X86 protocol://url/archivefile3
    #   PPC protocol://url/archivefile4

    ${DOWNLOADS[SRC]} will be :
        "protocol://url/archivefile1 protocol://url/archivefile2"
    ${DOWNLOADS[X86]} will be : "protocol://url/archivefile3"
    ${DOWNLOADS[PPC]} will be : "protocol://url/archivefile4"

If this is combined with the TYPE variable, ${DOWNLOADS[${TYPE}]} will
provide the downloads that need to be performed for the active installation
type and platform.

DOWNLOADNAMES
.............
Contains the list of download names that are active for the installation.
These make it possible to react correctly according to conditional download
directions that have been set up in the package header. For example, if the
downloads have been defined as follows:

    # DOWNLOADS
    #   IF archivefile-1.0.tar.bz2
    #     NAME base-1.0
    #     SRC protocol://url/archivefile-1.0.diff.bz2
    #     SRC protocol://url/archivefile-1.1.diff.bz2
    #     SRC protocol://url/archivefile-1.2.diff.bz2
    #   ELIF archivefile-1.1.tar.bz2
    #     NAME base-1.1
    #     SRC protocol://url/archivefile-1.1.diff.bz2
    #     SRC protocol://url/archivefile-1.2.diff.bz2
    #   ELIF archivefile-1.2.tar.bz2
    #     NAME base-1.2
    #     SRC protocol://url/archivefile-1.2.diff.bz2
    #   ELSE
    #     SRC protocol://url/archivefile-1.3.tar.bz2
    #   ENDIF

and the 'archivefile-1.1.tar.bz2' was already present in the repository of
downloaded archives,

    ${DOWNLOADNAMES} will be : "base-1.1"

Build-related
.............

    TMP  The location for the temporary stage files during installation.
    B    The location for the temporary files of this package installation.
    T    The package specific temp subdir for use during build.
    W    The package specific work subdir for use during build.
    D    The package specific image subdir for use during build.
    S    The package specific directory where the sources can be found.

These variables are derived from the tmpdir setting and the package name.
Consider the following context:

    ${tmpdir} is /var/tmp
    package is foo-1.0_pre1-r1

The above variables will then be as follows:

    TMP  /var/tmp/bagheera
    B    /var/tmp/bagheera/foo-1.0_pre1-r1
    T    /var/tmp/bagheera/foo-1.0_pre1-r1/temp
    D    /var/tmp/bagheera/foo-1.0_pre1-r1/image
    W    /var/tmp/bagheera/foo-1.0_pre1-r1/work
    S    /var/tmp/bagheera/foo-1.0_pre1-r1/work/foo-1.0

Package-related
...............

    PF   The full name of package with version and revision.
    P    The name of the package with version.
    PN   The name of the package.
    PV   The version of the package.
    PR   The revision of the package.
    PVR  The version and the revision of the package.
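As a rough illustration of how the package-related variables listed above
relate to a package file name, the following shell sketch splits a name
into its parts. It is not taken from Bagheera itself: the function name
split_package_id is hypothetical, and the sketch assumes that PR is empty
when a package has no revision, which this document does not specify. It
relies on the naming rule that a minus sign in the package name is never
followed by a digit, so the first '-<digit>' marks the start of the version.

```shell
# Hypothetical helper: derive PF, P, PN, PV, PR and PVR from a package
# file name such as my-nice-foo-1.0_pre1-r1.bag.
split_package_id() {
    PF=${1%.bag}                                   # drop the .bag extension
    # The name ends at the first minus sign that is followed by a digit.
    PN=$(printf '%s\n' "$PF" | sed 's/-[0-9].*$//')
    PVR=${PF#"$PN"-}                               # version plus revision
    case $PVR in
        *-r[0-9]*) PV=${PVR%-r*}; PR=${PVR##*-} ;; # revision present
        *)         PV=$PVR;       PR= ;;           # assumption: empty PR
    esac
    P=$PN-$PV
}

split_package_id my-nice-foo-1.0_pre1-r1.bag
echo "PF=$PF P=$P PN=$PN PV=$PV PR=$PR PVR=$PVR"
```

The sketch only uses POSIX parameter expansion and sed, so it should behave
the same under bash and other POSIX shells.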
If one considers the package my-nice-foo-1.0_pre1-r1.bag, the variables
will thus be as follows:

    ${PF}  will be : "my-nice-foo-1.0_pre1-r1"
    ${P}   will be : "my-nice-foo-1.0_pre1"
    ${PN}  will be : "my-nice-foo"
    ${PV}  will be : "1.0_pre1"
    ${PR}  will be : "r1"
    ${PVR} will be : "1.0_pre1-r1"

Available bash functions
------------------------

Sandbox manipulation
....................

    addread     Adds a path to the list of readable paths.
    addwrite    Adds a path to the list of writable paths.
    adddeny     Adds a path to the list of denied paths.
    addpredict  Adds a path to the list of predicted conflicts.

Package writing convenience
...........................

    use         Checks for the presence of a use flag.
    has         Checks for the occurrence of a value in a list.
    try         Executes a command and exits when it wasn't successful.
    bunpack     Automatically unpacks a downloaded archive to the current
                dir.
    bconfigure  Automatically calls configure with common parameters.
    binstall    Automatically executes make install with common parameters.
    bmake       Executes make with common options.
    bpatch      Tries several patchlevels and applies the one that returns
                no errors.
    bsed        Replaces a file after processing it with a sed command (by
                default strips away occurrences of the ${D} dir).

Deferred message display support
................................

    notice      Stores a notice message in the deferred message spool.
    warning     Stores a warning message in the deferred message spool.
    error       Stores an error message in the deferred message spool.

Separated package configuration tasks support
.............................................

    addtask     Adds a new configuration task that has to be executed
                manually.

Staged installation support
...........................

The following commands make it easier to work with the staged installation
directory. All destinations are automatically prefixed with the correct
staging directory.

    into        Sets the destination dir for the next bin, sbin & lib
                actions.
    docinto     Sets the destination dir for the next doc actions.
    exeinto     Sets the destination dir for the next exe actions.
    insinto     Sets the destination dir for the next inst actions.

    diropts     Sets the file access permissions for the next dir actions.
    exeopts     Sets the file access permissions for the next exe actions.
    insopts     Sets the file access permissions for the next inst actions.
    libopts     Sets the file access permissions for the next lib actions.

    dobin       Installs a binary into the bin dest with standard perms.
    dodir       Creates a new directory with the dir settings (dest+perm).
    dodoc       Installs a doc file into the doc dest with standard perms.
    doexe       Installs an executable with the exe settings (dest+perm).
    dohard      Creates a hard link.
    doinfo      Installs an info file in the standard dir with standard
                perms.
    doins       Installs a file with the ins settings (dest+perm).
    dojar       Installs a jar file in the standard dir with standard perms.
    dolib       Installs a library with the lib settings (dest+perm).
    dolib.a     Installs a static lib into the lib dest with standard perms.
    dolib.so    Installs a dynamic lib into the lib dest with standard
                perms.
    doman       Installs a man file in the standard dir with standard perms.
    dosbin      Installs an sbin file into the sbin dest with standard
                perms.
    dosym       Creates a symbolic link.

    newbin      Installs a binary with dobin under another name.
    newdoc      Installs a doc file with dodoc under another name.
    newexe      Installs an executable with doexe under another name.
    newins      Installs a file with doins under another name.
    newlib.a    Installs a static lib with dolib.a under another name.
    newlib.so   Installs a dynamic lib with dolib.so under another name.
    newman      Installs a man file with doman under another name.
    newsbin     Installs an sbin file with dosbin under another name.

    fowners     Changes the ownership of files in the staging dir.
    fperms      Changes the permissions of files in the staging dir.
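To make the interplay between the 'into', 'opts' and 'do' families a bit
more concrete, here is a minimal, hypothetical re-implementation of
exeinto, exeopts and doexe. The real Bagheera helpers are not shown in this
document, so everything below is an assumption made for illustration: the
default destination and permissions, the variable names EXE_DEST and
EXE_OPTS, and the use of the standard install command. The only behaviour
taken from the text above is that settings apply to the next actions and
that every destination is prefixed with the staging directory ${D}.

```shell
# Staging ("image") directory; Bagheera would provide ${D} itself.
D=$(mktemp -d)

EXE_DEST=/usr/bin      # assumed default destination for exe actions
EXE_OPTS=-m0755        # assumed default permissions for exe actions

exeinto() { EXE_DEST=$1; }     # destination for the next doexe calls
exeopts() { EXE_OPTS=$*; }     # install options for the next doexe calls

doexe() {
    # Every destination is prefixed with the staging directory.
    mkdir -p "${D}${EXE_DEST}" || return 1
    for file in "$@"; do
        install $EXE_OPTS "$file" "${D}${EXE_DEST}/" || return 1
    done
}

# Usage: stage a small script under /opt/demo/bin with custom permissions.
src=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$src/hello"
exeinto /opt/demo/bin
exeopts -m0750
doexe "$src/hello"
ls -l "${D}/opt/demo/bin/hello"
```

Nothing is written outside ${D}, which is the property the staged
installation commands are meant to guarantee.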
    [ TODO ]
    dohtml
    prepinfo
    preplib
    preplib.so
    prepman
    prepstrip
    prepall
    prepalldocs
    prepallinfo
    prepallman
    prepallstrip

Package format
--------------

A Bagheera package (baggy) consists of two major parts: the integration
header and the installation logic.

The header provides all the information that is required to integrate the
package in the system. The header is delimited by '##baggy##' identifiers.
Between those identifiers, the lines that start with '# ' will be
interpreted as meta data during the import of the package into the
database. Lines with other prefixes will be ignored during the import
process. The header is divided into a series of declaration contexts. Each
context starts with a name that identifies it and all data that follows it
will be interpreted in the active context. The following contexts are
available: DISPLAYNAME, SUMMARY, DESCRIPTION, HOMEPAGE, MAINTAINER,
ORGANIZATION, CATEGORIES, LICENSES, DOWNLOADS, DIGESTS, CDEPEND, RDEPEND,
USES, PROVIDES, COMPATIDS, BOUNDARY, CHANGELOG, CONTENTS.

The installation logic is specified by regular bash functions that all
contain default behaviour that would install the ideal package
automatically. However, this is rarely applicable and developers are thus
able to override these functions to provide custom behaviour.

## baggy ##
# DISPLAYNAME
#   the name of the package as it should appear on screen
#
# SUMMARY
#   very short description of the package
#
# DESCRIPTION
#   several lines of description that provide detailed information about
#   the package.
#
# HOMEPAGE
#   url
#
# MAINTAINER
#   name
#
# ORGANIZATION
#   name (url)
#
# CATEGORIES
#   list of categories from a predefined list. They can span multiple lines
#   and are separated by spaces.
#
# LICENSES
#   list of licenses from a predefined list. They can span multiple lines
#   and are separated by spaces.
#
# DOWNLOADS
#   SRC protocol://url/archivefile1    (source installation)
#   SRC protocol://url/archivefile2    (source installation)
#   SRC protocol://url/archivefile3    (source installation)
#   X86 protocol://url/archivefile4    (binary installation)
#   PPC protocol://url/archivefile5    (binary installation)
#   IF archivefile-1.0.tar.bz2
#     NAME base-1.0
#     SRC protocol://url/archivefile-1.0.diff.bz2
#     SRC protocol://url/archivefile-1.1.diff.bz2
#     SRC protocol://url/archivefile-1.2.diff.bz2
#   ELIF archivefile-1.1.tar.bz2
#     NAME base-1.1
#     SRC protocol://url/archivefile-1.1.diff.bz2
#     SRC protocol://url/archivefile-1.2.diff.bz2
#   ELIF archivefile-1.2.tar.bz2
#     NAME base-1.2
#     SRC protocol://url/archivefile-1.2.diff.bz2
#   ELSE
#     SRC protocol://url/archivefile-1.3.tar.bz2
#   ENDIF
#
# DIGESTS
#   MD5 checksum archivefile1    (other digest types are possible)
#   MD5 checksum archivefile2
#   MD5 checksum archivefile3
#   MD5 checksum archivefile4
#   MD5 checksum archivefile5
#
# CDEPEND
#   sequence of compile-time dependencies. The definition can span multiple
#   lines.
#
# RDEPEND
#   sequence of runtime dependencies. The definition can span multiple
#   lines.
#
# USES
#   list of use flags from a predefined list. They can span multiple lines
#   and are separated by spaces.
#
# PROVIDES
#   list of virtual packages from a predefined list. They can span multiple
#   lines and are separated by spaces.
#
# COMPATIDS
#   list of provided compatibility identifiers. They can span multiple lines
#   and are separated by spaces.
#
# BOUNDARY
#   explicit or none (none is the default when omitted)
#
# CHANGELOG
#   version-revision
#     * date name
#       changes
#     * date name
#       changes
#   version-revision
#     * date name
#       changes
#
# CONTENTS
#   /path/to/file1
#   /path/to/file2
#   /path/to/file3
#   /path/to/file4
#   useflag1
#     -/path/to/file1
#     +/path/to/file5
#   useflag1 useflag2
#     -/path/to/file1
#     +/path/to/file5
#     +/path/to/file6
## baggy ##

# Downloads the package, usually doesn't have to be overridden
src_download ();

# Unpacks the package, usually doesn't have to be overridden
src_unpack ();

# Configure the package, call the configure-script etc.
src_configure ();

# Build the package, most often doesn't have to be overridden
# Will call make
src_build ();

# Installs the package in the staging area
src_install ();

# Called before installation
pkg_preinst ();

# Called after installation
pkg_postinst ();

# Called before removal of package
pkg_preremove ();

# Called after removal of package
pkg_postremove ();

Features
========

Well designed, well maintained and robust core library
------------------------------------------------------

While it's important to encourage developers to contribute to Bagheera, the
core library will not be open for development by every John Doe hacker that
comes along. The developers that work on the core library know very well
where to go, what to do and how to do it. They use the tools that are the
best to achieve their goals and the tools that they know best. Through the
use of BQL and language bindings, other developers have the possibility to
extend Bagheera with additional tools as much as they want. It's important
that the core functionalities are only being worked on by people who know
why every decision was made and how to use the architecture that has been
chosen. Every functionality has to have associated unit tests to ensure
consistent stability and robustness throughout the development.

Packages are simple bash scripts
--------------------------------

Packages closely mimic what a user types on the command line to manually
install a package. Each package has to have an identical format and should
limit the use of internal custom bash functions which obscure the immediate
logical and execution flow of a package. This means that ideally, by
setting up the required environment variables, a package should be
executable as a simple bash script on platforms that don't have Bagheera
installed.

Single-file packages
--------------------

Packages shouldn't contain *any* additional files in the repository. Each
external file should be fetched remotely. This is to keep the repository
clean and to promote upstream patch submission. If needed, short patches
can be inlined like this:

    patch -Nfp0 << 'EOP'
    --- Makefile     Mon Feb 19 18:35:31 2001
    +++ Makefile_new Mon Feb 19 18:35:06 2001
    @@ -7,7 +7,7 @@
     CC= gcc

     # Install prefix
    -PREFIX=/usr
    +PREFIX=/opt/grip

     # Installation directory -- where the binary will go
     INSTALLDIR= $(PREFIX)/bin
    EOP

If developers don't want such patches to clutter the code logic, they can
put them in dedicated functions at the end of the package script and call
those functions from where the patches need to be applied. The same can be
done for all other files (rc scripts, etc.).

Sandbox protection for packages
-------------------------------

During the compilation and installation of packages, a sandbox prevents any
libc calls from writing to directories outside of the staging area. This
ensures that all files are being monitored and that the system is always
coherent.

Integrated relational database
------------------------------

Instead of storing the local package database and its meta-data as a
directory structure on the local filesystem, a relational database contains
all structures to facilitate the querying of package data, dependencies and
associated data.
This means that all relations are constantly present and that it's not
necessary to traverse and parse a local directory tree at the execution of
each command.

Enhanced safe merging
---------------------

Each package installation, together with its associated dependencies, is
installed as a whole to a staging area (not on a per-package basis). A
compilation error anywhere in the installation process therefore cannot
leave a partially installed dependency tree behind. It should however be
possible to resume the installation after making package modifications
without having to recompile all dependencies again (intelligent reuse of
the staging area).

Multiple prioritized parent repositories
----------------------------------------

Bagheera only installs from a local package tree. It however synchronizes
this tree from any number of parent repositories. The initial protocols to
sync from (not sync to) will be rsync, cvs, http, ftp and bql (for the
installed package tree). The importance of a repository is determined by
the order of precedence of the repository URLs in the configuration.
Packages coming from more important repositories replace less important
versions if there's a conflict.

The list of repositories is defined as a comma-separated list of repository
locators. The format of a repository locator is as follows:

    identifier:[login[/password]@]protocol://hostname[/location]

    identifier  The identifier that will be used for this repository.
                'local' is reserved for the local repository and can't be
                used by any other one.
    login       The optional login to connect to the remote repository.
    password    The optional password to connect to the remote repository.
    protocol    The protocol that has to be used to connect to the remote
                repository.
    hostname    The name of the host where the remote repository is located.
    location    The optional location on the remote host to access the
                repository.
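The locator format above can be taken apart with plain shell parameter
expansion. The following is only a sketch under stated assumptions, not
Bagheera's actual parser: the function name parse_locator and the REPO_*
variable names are hypothetical, the reserved identifier 'local' is simply
rejected, and a login or password containing '@' or '/' is not handled.

```shell
# Hypothetical parser for:
#   identifier:[login[/password]@]protocol://hostname[/location]
# Results are left in the REPO_* variables.
parse_locator() {
    REPO_ID=${1%%:*}                 # text before the first colon
    rest=${1#*:}                     # everything after the identifier
    REPO_LOGIN= REPO_PASSWORD= REPO_LOCATION=
    # Credentials, if any, appear before the protocol and end with '@'.
    case ${rest%%://*} in
        *@*)
            creds=${rest%%@*}
            rest=${rest#*@}
            REPO_LOGIN=${creds%%/*}
            case $creds in */*) REPO_PASSWORD=${creds#*/} ;; esac
            ;;
    esac
    REPO_PROTOCOL=${rest%%://*}
    hostpath=${rest#*://}
    REPO_HOST=${hostpath%%/*}
    case $hostpath in */*) REPO_LOCATION=${hostpath#*/} ;; esac
    # 'local' is reserved for the local repository.
    [ "$REPO_ID" != local ]
}

# Usage: walk a comma-separated list in its configured order.
repositories="stable:rsync://example.org/stable,dev:me/secret@ftp://example.org"
old_ifs=$IFS
IFS=,
for locator in $repositories; do
    parse_locator "$locator" &&
        echo "$REPO_ID: $REPO_PROTOCOL on $REPO_HOST (login: ${REPO_LOGIN:-none})"
done
IFS=$old_ifs
```

The host name example.org and the credentials in the usage lines are
placeholders; the loop merely visits the locators in the order in which
they are listed.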

A complete repository string can for example be (line wrapped here, but it
should normally be all on one line):

    stable:rsync://www.papuaos.org/stable,dev:rsync://www.papuaos.org/dev,
    experimental:gbevin/password@ftp://experimental.papuaos.org

Internally each repository will be synchronized / downloaded to a dedicated
directory that corresponds to the identifier of the repository. The local
repository will then be built by respecting the specified order of remote
repositories.

Distributed repository
----------------------

The database of installed packages can also serve as a repository. This
makes it possible to maintain the package installation on one computer and
sync other computers on a company's network from that single maintained
computer. This makes updating custom company packages very easy.

Intelligent system configuration file protection
------------------------------------------------

Directories of the underlying operating system that contain configuration
files are protected. This means that a package cannot overwrite any changes
that have been made to the configuration files. Bagheera however examines
all the config files to detect whether they are unchanged from the
originally installed version. If the original version is still present, the
configuration file replacement is permitted anyway.

Deferred message display
------------------------

Any messages that are being displayed during the compilation or
installation of packages are logged to a spool directory and can optionally
be displayed automatically. If the display is not turned on, the user is
warned about the presence of new messages. This ensures that important
information is not lost in the flow of output.

Separated package configuration tasks
-------------------------------------

Each package can provide configuration files that are installed in a
dedicated repository.
These configuration files contain all the commands that configure a package
further than a standard installation (e.g. database initialization for
mysql and postgres, ssh key generation, etc.). After the installation of
such packages, users are warned about the presence of new configuration
files and they have to execute them manually themselves.

Intuitive package versioning with overview of changes
-----------------------------------------------------

Users should easily be able to see what the changes and benefits are from
installing new versions of already installed packages. Each package has to
specify which changes were made by marking release versions with clear
boundaries. These changes can then be displayed to the users so that they
can clearly see what will happen if they upgrade to a new version.

Virtual packages
----------------

Virtual packages provide an abstract way to depend on functionalities that
are provided by several packages in a transparently interchangeable way.
Packages are able to define which virtual packages they provide. Virtual
packages should be defined beforehand in a list of supported virtuals. When
virtuals are used in dependencies and need to be resolved, Bagheera queries
the database to look for packages that provide the required virtual
functionality. If only one such package is found, it is automatically used.
If multiple packages are found, the user is requested to make a selection.

USE flags
---------

The purpose of USE flags is to allow users to automatically enable or
disable certain optional package features.

For example: let's say you're a Gnome fan. Therefore, you'd like any
package that has optional support for Gnome to use it. In this case, you
add 'gnome' to the active use flags and Bagheera will automatically add
Gnome functionalities to packages that support it. Likewise, if you don't
want optional Gnome features to be added to your packages, simply make sure
that 'gnome' is not set in the USE flags.
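The following shell sketch shows one way the active flags could be computed
and queried. It is an illustration under assumptions rather than Bagheera's
implementation: merge_use is a hypothetical helper, the use function is a
stand-in for the built-in of the same name, and the meaning of a leading
minus sign (remove the flag) is inferred from the --use="-gnome ipv6"
runtime example.

```shell
# merge_use BASE MODIFIERS
# Prints BASE with each "-flag" modifier removed and each plain modifier
# added (if it is not already present).
merge_use() {
    merged=
    for flag in $1; do
        merged="$merged $flag"
    done
    for mod in $2; do
        case $mod in
            -*)
                kept=
                for flag in $merged; do
                    [ "$flag" = "${mod#-}" ] || kept="$kept $flag"
                done
                merged=$kept
                ;;
            *)
                case " $merged " in
                    *" $mod "*) ;;                  # already active
                    *) merged="$merged $mod" ;;
                esac
                ;;
        esac
    done
    printf '%s\n' "${merged# }"
}

# Stand-in for the 'use' helper: succeeds when the flag is active.
use() {
    case " $USE " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

USE=$(merge_use "gnome gtk+ ssl" "-gnome ipv6")
echo "$USE"
if use ssl; then echo "building with ssl support"; fi
use gnome || echo "building without gnome support"
```

A package script would typically branch on 'use <flag>' in the same way
when deciding which configure options to pass.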

Support for integration in GUI frontends
----------------------------------------

Since there is BQL, the interface to manipulate packages is very easy to
define. The only thing that the GUI has to do is provide a valid BQL
string. Two methods should be sufficient (shown in Java; other APIs will be
similar):

    int pretendBql(String query);
    int executeBql(String query);

The first method returns detailed information about what would happen if
the query manipulates the database. It can for example be used to give
users a quick overview and ask for confirmation. The second method would
actually execute the query.

The integer return value is a unique positive number that identifies the
executing query; -1 indicates that the provided BQL string was not valid.
Note that these commands don't wait for the query to terminate: they merely
start a new process and return immediately.

Independent of that, the most difficult part is to keep a GUI responsive
and to provide the required information to the user. Therefore a simple
system of listeners allows a GUI to 'subscribe' to notifications about the
progress of the execution. The following method registers a listener for a
certain query:

    void addListener(int queryId, BqlListener callback);

The BqlListener class has the following interface:

    public interface BqlListener
    {
        void result(int queryId, String content);
        void advance(int queryId, int position, int end);
        void notice(int queryId, String text);
        void warning(int queryId, String text);
        void error(int queryId, String text);
        void abort(int queryId);
        void finish(int queryId);
    }

When a query finishes, the finish event is triggered and all listeners are
automatically removed.
Additionally, the GUI is able to manage the execution of a query through the following API:

    boolean isRunning(int queryId);
    boolean requestAbort(int queryId);

Support for alternate download hosts (mirrors)
----------------------------------------------

Packages can use mirror aliases when specifying download locations. These aliases need to be defined beforehand in a repository configuration file, but can be overridden by the user in local system-wide settings.

Compatibility indicators
------------------------

COMPATIDS indicate which package ids provide exactly the same functionality as others. This makes it possible for old installations to be safely removed without removing any functionality from the system. It's a safe way to clean away package ids in the database that have no use anymore. The indicators are needed because it requires human intervention to determine which package ids provide the same features as others: this is rarely directly derivable from version information.

Compatibility indicators solve two typical package management problems that occur when deciding which packages are outdated and obsolete, namely : several incompatible versioning streams within the same package and future package grouping.

versioning streams
..................

When applications or libraries are installed, they provide certain functionality. This functionality typically stays compatible when newer versions come out. However, when major version changes occur, this functional compatibility is often not preserved.

Examples :

    gtk+-1.2.10   <> gtk+-2.0.5
    autoconf-2.13 <> autoconf-2.53a

A system needs these versions to be able to co-exist without them being removed when the package manager clears out old versions. Therefore, compatibility indicators are introduced. They are a bit like separate versioning streams within the same package.
Let's assume that the following packages are installed :

    gtk+-1.2.9     (COMPATIDS=GTK+-1.2)
    gtk+-1.2.10-r1 (COMPATIDS=GTK+-1.2)
    gtk+-1.2.10-r2 (COMPATIDS=GTK+-1.2)
    gtk+-2.0.4     (COMPATIDS=GTK+-2)
    gtk+-2.0.5     (COMPATIDS=GTK+-2)
    autoconf-2.12  (COMPATIDS=AUTOCONF-2.1)
    autoconf-2.13  (COMPATIDS=AUTOCONF-2.1)
    autoconf-2.53a (COMPATIDS=AUTOCONF-2.5)

Using the information above, the packages are grouped in the following versioning streams by Bagheera :

    gtk+ (COMPATIDS=GTK+-1.2)            gtk+ (COMPATIDS=GTK+-2)

        1.2.10-r2                            2.0.5
            v                                  v
        1.2.10-r1                            2.0.4
            v
        1.2.9

    autoconf (COMPATIDS=AUTOCONF-2.1)    autoconf (COMPATIDS=AUTOCONF-2.5)

        2.13                                 2.53a
          v
        2.12

When determining which old versions need to be cleared, Bagheera looks at the most recent package in each versioning stream and considers the others obsolete.

The following packages will thus remain installed :

    gtk+-1.2.10-r2
    gtk+-2.0.5
    autoconf-2.13
    autoconf-2.53a

and these will be removed :

    gtk+-1.2.10-r1
    gtk+-1.2.9
    gtk+-2.0.4
    autoconf-2.12

While Bagheera accepts any string as a compatibility id, it's recommended to make it reflect the first release of a package that became incompatible with the prior versions. Subsequent compatible releases can then use the same compatibility id.

future package grouping
.......................

Consider the following scenario where a standalone package, kopete, gets included into the kdenetwork package as part of a group of other applications. Thus, the following happens :

    install kdenetwork-3.1
    install kopete-0.6.1
    upgrade kdenetwork-3.2 (which now includes kopete 0.6.2)
    clean

In this case the standalone kopete should be removed, since the new kdenetwork contains a part that is compatible with kopete and also newer.
The following compatibility ids should thus be declared :

    kdenetwork-3.1 (COMPATIDS=kdenetwork-3)
    kopete-0.6.1   (COMPATIDS=kopete-0)
    kdenetwork-3.2 (COMPATIDS=kdenetwork-3 kopete-0)

With this information, it's possible for Bagheera to figure out that after the upgrade to kdenetwork-3.2, kopete is allowed to be cleaned away.

The cleaning logic is a bit more complex though, since a package can only be cleaned if there are newer versions available for *all* the compatibility ids that it specifies. Consider this theoretical situation:

    install kdenetwork-3.1
    install kopete-0.6.1
    upgrade kdenetwork-3.2 (which now includes kopete 0.6.2)
    install kopete-0.6.3 (newer standalone version)
    clean

We thus have the following definitions :

    kdenetwork-3.1 (COMPATIDS=kdenetwork-3)
    kopete-0.6.1   (COMPATIDS=kopete-0)
    kdenetwork-3.2 (COMPATIDS=kdenetwork-3 kopete-0)
    kopete-0.6.3   (COMPATIDS=kopete-0)

The clean should again only remove the kopete-0.6.1 package and of course not kdenetwork-3.2, since the "kdenetwork-3" compatibility id is not provided by a more recently installed package.

Several package formats
-----------------------

To support easy installation from source, from binary and in a disconnected setup, the following package formats will be supported :

    The baggy itself : .bag
    The baggy with the required source that can be installed without having to download anything : .src.bag
    The binary baggy : .bin.bag

Optionally, the .bag extension can be preceded by an identifier that starts with a dot and that will be completely ignored by Bagheera. This can for example be used by distributions to clearly differentiate the names of their packages.
For example :

    foo-1.0.pyxa.bag
    foo-1.0.bin.papua.bag
    foo-1.0.src.pyxa.bag

Console package tool names and arguments
----------------------------------------

[ TODO ]

The package manager : bag

with symlinked shortcut tool names for major actions, for example:

    bagsync
    bagfetch
    baginst
    bagupdate
    bagremove
    bagclean

The BQL console : bql

Dependency specification
------------------------

Packages are able to specify which other packages they depend upon. Bagheera will ensure that these packages are installed before the package that requires them. The dependency specification includes the following syntaxes:

* Basic package and version specification

    >=foo-2.0      foo 2.0 or later
    =foo-2.0       foo 2.0 exactly
    =foo-2.0*      any foo 2.0.x
    <=foo-2.0      foo 2.0 or previous
    ~foo-2.0       any package revision of foo 2.0
    =foo-2.0-r1    an absolutely referenced package

* Use flag integration

    png? ( >=foo-2.0 )
        depend on foo-2.0 or later if png is in the use flags

* Extended version specification

    foo[ >=2.0 <=2.2 ]
        (these can be all of >=V, =V, =V*, <=V and ~V operators)
        any foo >=2.0 <=2.2
    foo[ -readline png ]
        foo without readline use flag, and with png use flag
    foo[ gnome? ( gtk+ ) ]
        if the gnome use flag is used, then foo must be installed with gtk+ use flag
    foo[ =1.0 -readline | =2.0 ]
        (| operator to conditionally combine the modifiers above)
        either foo 1.0 without readline, or foo 2.0 exactly (with or without readline).

* Mutually exclusive packages

    !foo           cannot co-exist with foo

* Combining dependencies

    ( =foo-2.0* <=fum-3.0 ) | ( =fab-4.0-r1 )

  Dependencies can be specified in series to set a collection of dependencies that all need to be resolved together. It's possible to group several dependencies with () and specify alternative sets with the | operator.

Support for SCM packages (CVS, Bitkeeper, ...)
----------------------------------------------

A package can have a version, an scm name element, or both.
The order of version precedence is as follows:

    foo-scm > foo-2.0 > foo-2.0-scm > foo-1.0 > foo-1.0-scm

An scm package without a version gets its source from repository HEAD, so it's always newest. When it is installed, however, it is automatically given the version number of the latest non-scm baggy in existence. If only the scm baggy exists, it gets version 0. That way, when a new non-scm package is added with a greater version, it serves as an update.

An scm package with a version gets code from a specific tag. Source control tags can be used for several purposes : they indicate the release of a version or they indicate the differentiation of a branch. The version release is not handled by Bagheera since there are source archives available for those; it's thus only sensible to handle different branches of development sources. The source control files are thus always earlier in version than the released version.

Because an scm package is always newer than itself when its installation is finished, it won't be considered for a world update. Instead, a dedicated command-line argument can be specified which will allow Bagheera to specifically update scm packages (bagupdate --scm package). The same argument can also be used to specifically install the latest scm version of a package (baginst --scm package).

The actual retrieval of the source files is specific to each source control tool, the naming of the tags differs amongst projects and the branching policy can vary widely too. Therefore, it's not possible to extract the commands that fetch the sources and isolate them outside the actual packages in a general scheme. It's up to the package build script to execute the commands which retrieve the required sources to the correct directory.

Flattened package structure with category keywords
--------------------------------------------------

The package directory structure will be flat (i.e. all packages are in the same directory).
A category structure will be provided, based on keywords. Something like CATEGORIES="development c editor" for a C IDE. Categories cannot just be set without the system knowing about them, therefore there has to be a global 'categories' file that contains all the categories that are being used throughout a repository. Look at this as a prior declaration of valid category keywords.

Binary package USE flag support
-------------------------------

Developers have a package creation tool at their disposal. Amongst other things it will require a developer that submits binary packages to try out all supported use flag permutations. Since baggies have to declare the use flags that they depend on before they can use them, it's not difficult to go over all combinations. A binary package would then be signed with a unique validation id which gives an indication of the quality and conformance of the package.

Binary packages are simple tarballs that are installable on any system by just unarchiving the tarball. However, the tarball contains a very well structured set of files together with all the required package meta-data.

Instead of using binary diffs in a special format, Bagheera works differently. The base binary package is the one without any active use flags. This can be used as a reference for all other variations. When a new use-flag variation is compiled, it's compared with the reference and a list of removed, modified and added files is created. Only the modified and added files are preserved and stored together with the difference information in a different directory. The directory structure of a binary package will thus indicate the differences.

For example, when you untar foo-package, you'll get :
    /var/tmp/install/foo-package-1.0/foo-package-1.0.bag
    /var/tmp/install/foo-package-1.0/base/info
    /var/tmp/install/foo-package-1.0/base/info/CONTENTS
    /var/tmp/install/foo-package-1.0/base/files/usr/bin/foo
    /var/tmp/install/foo-package-1.0/base/files/usr/lib/libfoo.so.1.0.0
    /var/tmp/install/foo-package-1.0/base/files/usr/man/man1/foo.1
    /var/tmp/install/foo-package-1.0/use1/info
    /var/tmp/install/foo-package-1.0/use1/info/CONTENTS
    /var/tmp/install/foo-package-1.0/use1/files/usr/lib/libfoo.so.1.0.0
    /var/tmp/install/foo-package-1.0/use2/info
    /var/tmp/install/foo-package-1.0/use2/info/CONTENTS
    /var/tmp/install/foo-package-1.0/use2/files/usr/bin/varfoo
    /var/tmp/install/foo-package-1.0/use1_use2/info
    /var/tmp/install/foo-package-1.0/use1_use2/info/CONTENTS
    /var/tmp/install/foo-package-1.0/use1_use2/files/usr/bin/varfoo

The CONTENTS file would then contain the meta-data about the differences; the exact format still has to be defined. Bagheera would detect the currently active use flags and combine the unarchived files and information to create the appropriate version of the binary package.

Package creation tool
---------------------

Before shipping a package, reference builds have to be made with all possible USE flag permutations. Thanks to the sandbox, all read accesses are auto-detected and monitored. The resulting file list can be matched against the database of package contents and the dependencies can be auto-generated for each package. The automatically generated deps provide a good starting point for package writers to fine-tune and tailor them to a suitable specification. The reference builds also make it possible to automatically generate the list of installed files for each USE flag combination, the binary installation size, USE flag aware binary packages, etc.

The package creation tool bases itself on existing templates for package types to automatically include reusable functionality.
This neatly removes the need for an object-oriented packaging scheme where common functionality is isolated. The latter has the huge disadvantage that packages become unreadable and are not simple standalone bash scripts anymore. The template approach still allows developers to reuse common functionality easily.

This package creation tool can also contain additional checks to ensure that all required variables are provided and that the baggy doesn't contain common mistakes.

A finished package can be validated by the creation tool to indicate that the provided meta-data is correct and current. A unique, signed validation id makes it possible to get a direct indication of the quality of the package.

Bottom-first upgrade paths
--------------------------

When upgrading a collection of packages with their dependencies, the deepest packages need to be upgraded first to ensure that the higher packages are linked against the new versions of the dependencies.

Upgrade rules
-------------

While dependencies are a good way to conceptually describe the preconditions that need to be met before a package can be installed, they don't provide enough information to correctly upgrade installed packages. If the 'bottom-first' upgrade strategy were applied as-is, an upgrade of foo would cause the whole system to be upgraded. This is clearly unacceptable. Therefore, upgrade rules are introduced. These are automatically executed after the installation of a package to determine which requirements have to be met to consider the environment valid for the usage of the package. Such rules for example determine the actual names of the libraries that installed executables are linked against. If the upgrade of a dependency library provides the same required files, the package that depends on it doesn't need to be recompiled, otherwise the recompilation has to be executed.
This makes it possible to traverse all dependency branches entirely from the bottom to the top and correctly skip packages that don't need to be upgraded.

Sensible package uninstallation
-------------------------------

When a package is installed, it stores additional information which indicates whether the package was installed explicitly or not. If it was installed explicitly, it can only be removed manually. If it wasn't installed explicitly (thus automatically), it will be removed automatically too, if and only if the package being removed is the only explicitly installed package that has it in its dependency tree.

Again, with an example :

    baginst gnome -> installs x11, glib, gtk+ and gnome

    x11    : explicit 0, in dep tree of expl. packages (gnome)
    glib   : explicit 0, in dep tree of expl. packages (gnome)
    gtk+   : explicit 0, in dep tree of expl. packages (gnome)
    gnome  : explicit 1

    baginst kde -> installs qt and kde (however qt and thus kde relies on x11)

    x11    : explicit 0, in dep tree of expl. packages (gnome, kde)
    glib   : explicit 0, in dep tree of expl. packages (gnome)
    gtk+   : explicit 0, in dep tree of expl. packages (gnome)
    gnome  : explicit 1
    qt     : explicit 0, in dep tree of expl. packages (kde)
    kde    : explicit 1

    baginst qtunit -> installs qtunit (however qtunit relies on qt)

    x11    : explicit 0, in dep tree of expl. packages (gnome, kde, qtunit)
    glib   : explicit 0, in dep tree of expl. packages (gnome)
    gtk+   : explicit 0, in dep tree of expl. packages (gnome)
    gnome  : explicit 1
    qt     : explicit 0, in dep tree of expl. packages (kde, qtunit)
    kde    : explicit 1
    qtunit : explicit 1

Then, when a user wants to remove gnome, Bagheera finds that it's the only explicitly installed package that has glib and gtk+ in its dependency tree. Thus, gnome, glib and gtk+ will be removed.

    x11    : explicit 0, in dep tree of expl. packages (kde, qtunit)
    qt     : explicit 0, in dep tree of expl.
packages (kde, qtunit)
    kde    : explicit 1
    qtunit : explicit 1

Then when a user wants to remove kde, Bagheera finds that no automatically installed package belongs solely to the dependency tree of kde. Thus, only kde is removed.

    x11    : explicit 0, in dep tree of expl. packages (qtunit)
    qt     : explicit 0, in dep tree of expl. packages (qtunit)
    qtunit : explicit 1

Now when a user removes qtunit, Bagheera finds that it's the only explicitly installed package that has x11 and qt in its dependency tree. Thus, x11, qt and qtunit will be removed.

Users have the option to subscribe to implicitly installed packages by setting their status to explicit manually. This allows developers, or people that install software manually from source, to indicate which libraries or tools are required and prevent Bagheera from removing them.

Package boundaries
------------------

Not all packages have the same importance, some packages are more fundamental than others. Packages are able to declare a BOUNDARY variable. If it's set to 'explicit', Bagheera will always treat the package as being explicitly installed, even if it was installed through the dependency tree of another package. If x11 (and binutils, glibc, gcc, ...) thus declare this variable, they will never be automatically removed.

This defines clear 'boundaries' in package installation: once a boundary is crossed, the package behind it must be removed manually. If someone for example has a console-only system and installs kde, he'll cross the x11 boundary, and x11 will never be removed automatically. To prevent this behaviour from passing unnoticed, Bagheera will emit a notification when a boundary is crossed, so that a clear message can be displayed to the user.
As an example of its applicability, consider the following scenario :

    install kde   (depends on x11, which is implicitly installed)
    remove kde    (all free implicitly installed packages are removed)
    install gnome (depends on x11, which is implicitly installed)

Without boundaries, this would cause x11 to be removed and reinstalled immediately afterwards. This is clearly not wanted, since people that have x11 available don't want it to automagically disappear. People are well aware of its existence and will find it logical to have to remove it manually.

Safe removal of packages
------------------------

Bagheera refuses to remove installed packages that are being used by other packages. This is checked for all packages in the dependency tree of a package that is about to be removed. If at least one package is still used by a package that's not part of the dependency tree of the package that you want to remove, the whole operation is aborted and a summary of conflicts is provided to the user.

As an example, suppose you installed Kde, Qt and Gnome explicitly (in that order). Here's the summary :

    x11    : explicit 0, in dep tree of expl. packages (gnome, kde, qt)
    glib   : explicit 0, in dep tree of expl. packages (gnome)
    gtk+   : explicit 0, in dep tree of expl. packages (gnome)
    gnome  : explicit 1
    qt     : explicit 1, in dep tree of expl. packages (kde)
    kde    : explicit 1

When removing Qt, Bagheera will of course refuse the operation since Kde relies on it. Only when Qt and Kde are removed together will the system proceed.

Somewhere in my mind however I feel that it should be possible to force the removal, so that the situation explained above doesn't abort the operation. I'm not sure it's a good idea to allow this though.

This example also makes it possible to demonstrate another aspect. Since Qt was explicitly installed afterwards, it will never be removed automatically when Kde is removed.
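The removal rules of the last sections can be sketched in a few lines. This is only an illustration under assumptions: the RemovalPlanner class is hypothetical, the direct dependencies used in the demo (gnome on gtk+, gtk+ on glib and x11, kde on qt, qt on x11, qtunit on qt) are a guess that is consistent with the tables above, removing several packages together is not handled and BOUNDARY is left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the installed database, for illustration only.
final class RemovalPlanner {
    private final Map<String, List<String>> deps = new HashMap<>(); // direct dependencies
    private final Set<String> explicit = new TreeSet<>();           // explicitly installed

    void install(String pkg, boolean isExplicit, String... directDeps) {
        deps.put(pkg, Arrays.asList(directDeps));
        if (isExplicit) {
            explicit.add(pkg);
        }
    }

    // The package itself plus everything reachable through its dependencies.
    Set<String> depTree(String pkg) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(pkg);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (seen.add(current)) {
                for (String dep : deps.getOrDefault(current, Collections.<String>emptyList())) {
                    todo.push(dep);
                }
            }
        }
        return seen;
    }

    // Removes pkg and every automatically installed package that is only part of
    // its dependency tree; refuses when another explicit package still relies on pkg.
    Set<String> remove(String pkg) {
        Set<String> stillNeeded = new TreeSet<>();
        for (String other : explicit) {
            if (!other.equals(pkg)) {
                stillNeeded.addAll(depTree(other));
            }
        }
        if (stillNeeded.contains(pkg)) {
            throw new IllegalStateException(pkg + " is still used by another package");
        }
        Set<String> removal = new TreeSet<>();
        for (String candidate : depTree(pkg)) {
            boolean free = !explicit.contains(candidate) && !stillNeeded.contains(candidate);
            if (candidate.equals(pkg) || free) {
                removal.add(candidate);
            }
        }
        explicit.removeAll(removal);
        deps.keySet().removeAll(removal);
        return removal;
    }

    public static void main(String[] args) {
        RemovalPlanner planner = new RemovalPlanner();
        planner.install("x11", false);
        planner.install("glib", false);
        planner.install("gtk+", false, "glib", "x11");
        planner.install("gnome", true, "gtk+");
        planner.install("qt", false, "x11");
        planner.install("kde", true, "qt");
        planner.install("qtunit", true, "qt");
        System.out.println(planner.remove("gnome"));
        System.out.println(planner.remove("kde"));
        System.out.println(planner.remove("qtunit"));
    }
}
```

The three calls in main follow the gnome, kde and qtunit walkthrough of the "Sensible package uninstallation" section. Marking qt as explicit makes remove("qt") fail as long as kde is installed, which corresponds to the refusal described above.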
Safe removal of outdated packages
---------------------------------

When a clean operation is executed, outdated packages are automatically and safely removed from the system. The packages that can be removed are determined by comparing the compatibility ids of the packages. If several packages have the same compatibility ids, only the most recently installed version is preserved and the older ones are removed.

Note that new and old here really are based on installation time and not on version information. Since it's possible that a lower version is installed manually after a higher version, it's important that the date of the installation is used as a guideline. Otherwise, the files of the package version that's actually in use could be accidentally erased.

For example, consider the following packages :

    gtk+-1.2.9     (COMPATIDS=GTK+-1.2)
    gtk+-1.2.10-r1 (COMPATIDS=GTK+-1.2)
    gtk+-1.2.10-r2 (COMPATIDS=GTK+-1.2)
    gtk+-2.0.4     (COMPATIDS=GTK+-2)
    gtk+-2.0.5     (COMPATIDS=GTK+-2)

Which are installed in this order :

    gtk+-1.2.9, gtk+-1.2.10-r2, gtk+-2.0.5, gtk+-1.2.10-r1, gtk+-2.0.4

If the version information determined the newest packages, these packages would be removed :

    gtk+-1.2.9
    gtk+-1.2.10-r1
    gtk+-2.0.4

However, since gtk+-1.2.10-r1 and gtk+-2.0.4 were the last versions that were installed, this would cause all useful gtk+ files to be removed from the system. The packages that should be removed are :

    gtk+-1.2.9
    gtk+-1.2.10-r2
    gtk+-2.0.5

This is thus based on the last installation times.

Timed source installation
-------------------------

All source installations will be timed and the information will be available for retrieval. This can be an interesting guideline for people that want to install a new package from source. Eventually, hooks could be added to send the architecture, optimization parameters, package, use flags and timing to a central server.
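Coming back to the clean operation of "Safe removal of outdated packages": combined with the rule from "Compatibility indicators" that *all* compatibility ids of a package must be covered, the selection could be sketched like this. CleanPlanner, Installed and the numeric installOrder field are hypothetical; a real implementation would use the stored installation dates. Treating a package without any compatibility id as never outdated is an assumption as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the clean selection, for illustration only.
final class CleanPlanner {

    static final class Installed {
        final String id;
        final long installOrder;      // stands in for the installation date
        final Set<String> compatIds;

        Installed(String id, long installOrder, String... compatIds) {
            this.id = id;
            this.installOrder = installOrder;
            this.compatIds = new TreeSet<>(Arrays.asList(compatIds));
        }
    }

    // A package is outdated when every one of its compatibility ids is also
    // provided by some package that was installed later.
    static List<String> outdated(List<Installed> installed) {
        List<String> result = new ArrayList<>();
        for (Installed candidate : installed) {
            boolean allCovered = !candidate.compatIds.isEmpty();
            for (String compatId : candidate.compatIds) {
                boolean covered = false;
                for (Installed other : installed) {
                    if (other.installOrder > candidate.installOrder
                            && other.compatIds.contains(compatId)) {
                        covered = true;
                        break;
                    }
                }
                if (!covered) {
                    allCovered = false;
                    break;
                }
            }
            if (allCovered) {
                result.add(candidate.id);
            }
        }
        return result;
    }

    // The gtk+ example from the text, in its installation order.
    static List<Installed> gtkExample() {
        return Arrays.asList(
            new Installed("gtk+-1.2.9", 1, "GTK+-1.2"),
            new Installed("gtk+-1.2.10-r2", 2, "GTK+-1.2"),
            new Installed("gtk+-2.0.5", 3, "GTK+-2"),
            new Installed("gtk+-1.2.10-r1", 4, "GTK+-1.2"),
            new Installed("gtk+-2.0.4", 5, "GTK+-2"));
    }

    public static void main(String[] args) {
        System.out.println(outdated(gtkExample()));
    }
}
```

For the gtk+ example this selects gtk+-1.2.9, gtk+-1.2.10-r2 and gtk+-2.0.5, as in the text. Because every compatibility id has to be covered, a kdenetwork-3.2 that declares both kdenetwork-3 and kopete-0 stays installed when only a newer standalone kopete follows it.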
Typed search on all known packages
----------------------------------

Bagheera is able to search all information of both the installed and the available packages in a typed fashion, since the database contains all this information in a structured manner (see BQL).

It should be noted that all packages (also source packages) need to declare which files they are going to install for which USE flag parameters. To assist developers in doing this, a dedicated developer tool will be available. Thanks to this information, it will be easy for a user to find which package provides a particular file or tool.

License acceptance
------------------

All licenses are processed before installation and not during the installation process. The user will thus see a clear list of all licenses that still need to be accepted, with their related packages. There will be an interface to perform the acceptance task in a comfortable way. In the summary of the packages that will be installed, it's clearly shown which licenses are used by each package, even if they've already been accepted.

If a license has never been accepted, the user can read the license text and has to explicitly accept it. When this license is encountered later on and it has been accepted beforehand, it is implicitly accepted again.

Packages can embed license texts for non-general licenses or proprietary packages. These licenses have to be accepted every time the package is encountered.

License groups
--------------

Besides individual licenses, license groups are available to automatically accept a group of conceptually equal licenses. For example the OSI identifier accepts all the licenses that have been approved as being open-source.

Package-specific USE flag preservation
--------------------------------------

After successful installation, Bagheera stores two use-flag fields : 'globaluse' and 'localuse'.
Now when this package is reinstalled, Bagheera compares the stored 'globaluse' value with the currently active use flag settings. The following options are handled:

* 'globaluse' and the active use flags are the same

    'localuse' is used for the installation / upgrade

* 'globaluse' and the active use flags differ

    - 'globaluse' and 'localuse' are the same

        the active use flags are used for the installation / upgrade

    - 'globaluse' and 'localuse' differ

        Bagheera asks the user what to do: either use 'localuse' or use the active use flags.

Rollback functionality
----------------------

Optionally it's possible to turn on a rollback functionality which maintains a separate repository of the unrestorable operations that were executed during the installation of a package (files overwritten, symlinks changed, config files changed, etc.). This makes it possible for a user to go back to any moment in time in case he finds out that the current state of his system isn't satisfactory anymore. It's also possible to intuitively browse the changes in descending order of operation date, i.e. :

    Tuesday 18 June, 2002
        10:51:43 Installed kdelibs-3.0.1.20020604
        10:14:26 Installed gtkmm-1.2.9-r2
        09:57:25 Installed ltrace-0.3.26
        09:50:10 Removed gtk+-2.0.1-r2
    Monday 17 June, 2002
        23:15:09 Installed gtk+-2.0.3-r1
        22:51:19 Injected glib-2.0.3

Then a user can say, for example, "put me back in the state of Monday 17 June, 2002 at midnight". Bagheera will remove kdelibs-3.0.1.20020604, gtkmm-1.2.9-r2 and ltrace-0.3.26, and restore gtk+-2.0.1-r2.

Of course this will require additional disk space, but the users know this and have to enable this feature explicitly (or a distribution might opt to turn it on by default of course). Users will have the opportunity to clear the rollback database entirely or up to a certain date if they want to recover hard disk space.
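The rollback example can be sketched as a walk over the operation log: every entry that is newer than the requested moment is undone, newest first. RollbackLog, Entry and the textual "remove" / "restore" steps are hypothetical. Only Installed and Removed entries are modelled, the Injected entry of the log is left out, and "Monday at midnight" is read as the start of Tuesday 18 June.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the rollback selection, for illustration only.
final class RollbackLog {

    enum Action { INSTALLED, REMOVED }

    static final class Entry {
        final LocalDateTime when;
        final Action action;
        final String pkg;

        Entry(LocalDateTime when, Action action, String pkg) {
            this.when = when;
            this.action = action;
            this.pkg = pkg;
        }
    }

    // Undo steps, newest first, for every operation recorded after the target moment.
    static List<String> undoSteps(List<Entry> log, LocalDateTime target) {
        List<Entry> later = new ArrayList<>();
        for (Entry entry : log) {
            if (entry.when.isAfter(target)) {
                later.add(entry);
            }
        }
        later.sort((a, b) -> b.when.compareTo(a.when));
        List<String> steps = new ArrayList<>();
        for (Entry entry : later) {
            // An installation is undone by a removal and the other way around.
            String undo = entry.action == Action.INSTALLED ? "remove " : "restore ";
            steps.add(undo + entry.pkg);
        }
        return steps;
    }

    // The Installed and Removed entries of the log shown in the text.
    static List<Entry> sampleLog() {
        return Arrays.asList(
            new Entry(LocalDateTime.of(2002, 6, 17, 23, 15, 9), Action.INSTALLED, "gtk+-2.0.3-r1"),
            new Entry(LocalDateTime.of(2002, 6, 18, 9, 50, 10), Action.REMOVED, "gtk+-2.0.1-r2"),
            new Entry(LocalDateTime.of(2002, 6, 18, 9, 57, 25), Action.INSTALLED, "ltrace-0.3.26"),
            new Entry(LocalDateTime.of(2002, 6, 18, 10, 14, 26), Action.INSTALLED, "gtkmm-1.2.9-r2"),
            new Entry(LocalDateTime.of(2002, 6, 18, 10, 51, 43), Action.INSTALLED, "kdelibs-3.0.1.20020604"));
    }

    public static void main(String[] args) {
        LocalDateTime mondayMidnight = LocalDateTime.of(2002, 6, 18, 0, 0);
        for (String step : undoSteps(sampleLog(), mondayMidnight)) {
            System.out.println(step);
        }
    }
}
```

Undoing the newest operation first keeps the intermediate states consistent with the order in which the changes were originally applied.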
Bagheera Query Language
-----------------------

The Bagheera Query Language (BQL) is the core engine of the whole Bagheera package manager. It should provide a language construct for every low-level operation that the package manager has to perform. Note that BQL is not the sole way of interfacing with the database: all language constructs will in fact be mapped to action classes that can be called directly through the programming language API for tighter integration (this is what the core tools of Bagheera will do).

The BQL language constructs are basically split up in two parts : ACTION PACKAGE_SELECTOR. The action identifies what has to be executed and the package selector identifies which packages this has to be executed upon.

The PACKAGE_SELECTOR is identical to the dependency format and makes it possible to select one specific package id (name-version-revision), a collection of package ids, or either of the two according to a defined use var. It contains a number of virtual packages such as WORLD and SYSTEM.

The actions include information retrieval, package installation, removal and updating. This warrants an extension to the package selector format, namely the specification of the repository to access. The format thus becomes :

    repository:[operator]package-version-revision[operator]

Two virtual repository names are available : AVAILABLE and INSTALLED, which of course map to the database of available packages and the database of installed packages. The available repository corresponds to the local merge of all remote repositories that was executed during the last SYNC action.

SELECT
......

For the information retrieval I thought of taking inspiration from SQL, thus for example:

    SELECT DEPENDENCIES FROM PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

will provide a matrix of package ids and a list of all the packages they depend on.
Of course more information can be obtained simultaneously by grouping select targets :

    SELECT target1, target2 FROM PACKAGE_SELECTOR [ WHERE where_operation ]

To make queries easier, the following shortcuts are defined, because they are the most sensible defaults and will be used an awful lot :

    "SELECT FROM ..." is the same as "SELECT PACKAGEID FROM ..."
    "SELECT WHERE ..." is the same as "SELECT PACKAGEID FROM WORLD WHERE ..."

These are the current targets that I've identified for SELECT :

    CATEGORIES
        the categories to which the package id belongs
    NAME
        the package name of the package id
    VERSION
        the version number of the package id
    REVISION
        the revision of the package id
    REPOSITORY
        the repository to which the package id belongs
    COMPATIDS
        the compatibility ids of the package
    CONTENTS
        the contents of the package (the files that are installed)
    SOURCE
        the complete source file of the package
    CHANGED
        the contents of the package that are missing or changed (only makes sense for the INSTALLED repository of course)
    SIZE
        the space that is occupied by the binary installation of the package (also has to be provided with the extended submission procedure)
    CDEPEND, RDEPEND
        the packages that this package depends on
        This is handled differently according to the AVAILABLE or INSTALLED repository name. When querying the installed repository, you get the exact package ids that are being depended upon. For the available repository this could be several things : the exact dependency string, or the package ids that match this dependency string, where valid installed package ids are preferred to all the possible available options. Which one is best, I don't know yet (I tend to prefer the last one).
    USING
        the packages that are depending on a package
        This is only possible for the INSTALLED repository (imho it doesn't make sense for the AVAILABLE repository at all, or does it?)
    USES
        the use flags that are being used by the package
        This is handled differently according to the AVAILABLE or INSTALLED repositories. The available repository returns *all possible* use flags that this package understands, and the installed repository returns only the use flags that have been used for the installation.
    LICENSES
        the licenses under which the package is being distributed
    DISPLAYNAME
        the name of the package as it should appear on screen
    SUMMARY
        the short description of the package
    DESCRIPTION
        the long description of the package
    HOMEPAGE
        the homepage of the package
    MAINTAINER
        the maintainer of the package
    ORGANIZATION
        the organization that issues the package
    BOUNDARY
        the boundary that a package enforces
    CHANGELOG
        the history of changes of the package
    PROVIDES
        the virtual packages the package provides
    DOWNLOADS
        the definition of files that a package needs to download
    DOWNLOADED
        the list of files that a package has downloaded during installation (the AVAILABLE repository returns none)
    MISSINGDEPS
        the missing dependencies that have to be installed before this package can be installed (the INSTALLED repository returns none)
    DATE
        the INSTALLED repository returns the date when the package was installed and the AVAILABLE repository returns the date when this package was released
    CFLAGS, CHOST, CXXFLAGS
        the flags that were used to compile this package; the AVAILABLE repository either returns nothing or the flags that would be used, don't know yet
    PACKAGEID
        get the package ids that match the provided package selector
    OUTDATED
        get a list of outdated package ids, these are the packages that can certainly safely be removed (this only works for the INSTALLED repository, doesn't make sense for the AVAILABLE repository)
    CURRENT
        get a list of current package ids, these are the packages that are being used by the system and that should certainly not be removed (this only works for the INSTALLED repository, doesn't make sense for the AVAILABLE repository)
    OMITTED
        get all package ids within the same package that don't match the provided package selector (= negative selection)
    BLOCKING
        get a list of package ids from the AVAILABLE repository that will not be installed if the provided packages would be installed (note that this behaves exactly the same for the INSTALLED or AVAILABLE repositories)
    INJECTED
        get the list of injected packages from the INSTALLED repository (this only works for the INSTALLED repository, doesn't make sense for the AVAILABLE repository)
    SUBSCRIBED
        get the list of explicitly installed packages from the INSTALLED repository (this only works for the INSTALLED repository, doesn't make sense for the AVAILABLE repository)

The WHERE statement acts on the same targets as the ones that SELECT can return. However, instead of returning the information, it searches the targets for matches of a provided search expression and only considers the matching package ids for the final select information retrieval.

The search expressions are constructed as follows :

The SELECT targets typically come in two different kinds : a string or a (multi-dimensional) list. Since strings can be considered lists too (lists of characters), the scope can be abstracted to working with lists only.
The most basic search expressions are:

IS
    an exact match for equality
IN
    a check for containment

This brings us to the following example:

    SELECT WHERE '/bin/bash' IN CONTENTS

This gets all the package ids of the packages that contain the /bin/bash
file.

Another example:

    SELECT WHERE LICENSES IS 'GPL'

This gets only the package ids of packages that can be used under the GPL
license.

One more example:

    SELECT CATEGORIES, NAME FROM WORLD WHERE 'Gtk' IN DESCRIPTION

This gets all the categories and package names for which Gtk can be found
in the description.

A last example:

    SELECT WHERE PROVIDES IS ('virtual-mail', 'virtual-smtp')

This gets all package ids that provide exactly the virtual-mail and
virtual-smtp virtual packages. This example is purely academic; I haven't
yet found a real use for an exact match on a complete list. Maybe someone
else has an idea, otherwise this will most probably not be implemented.

Sadly, this syntax alone is not sufficient. It also has to be possible to
check for more intelligent matches. Therefore, the WHERE statement also
accepts a number of predefined functions and operators:

EXISTS select_target
    this returns a valid match if the select target that follows returns
    a value
UPDATEABLE
    this only works on INSTALLED repository package selectors and is true
    when the AVAILABLE repository contains new versions of the matching
    packages
NOT where_statement
    this negates the where statement that follows
where_statement AND where_statement
where_statement OR where_statement
    these boolean operators combine several where statements with each
    other

Examples:

    SELECT WHERE EXISTS MISSINGDEPS

This gets all packages for which missing dependencies have to be installed
before the packages themselves can be installed.

    SELECT FROM qmail WHERE EXISTS BLOCKING

This gets all packages that can no longer be installed if qmail is
installed.
    SELECT WHERE NOT EXISTS USING

This gets all package ids of packages that are not being used by other
packages.

    SELECT WHERE 'gnome-libs' IN DEPEND AND UPDATEABLE

This gets all package ids of packages that depend on gnome-libs and for
which new versions are available in the AVAILABLE repository.

FETCH
.....

To only download the archives required for package installation, the
following syntax is used:

    FETCH PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

Depending on the repository from which the package comes, either the
source or the binary packages are retrieved.

INSTALL
.......

To install a package, the following syntax is used:

    INSTALL PACKAGE_SELECTOR, ... [ FROM SOURCE|BINARY ] [ USE [-]FLAG1, ...]

Using the AVAILABLE repository installs a new package, while using the
INSTALLED repository re-installs an existing package. The FROM SOURCE
option is implied, and FROM BINARY fails when no binary archive can be
found. Optionally, temporary use flags can be specified for this
installation to tailor the behaviour of the packages.

REMOVE
......

To remove a package, the following syntax is used:

    REMOVE [ FORCE|NODEPS|USING ] PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

This is of course only possible from the INSTALLED repository. It fails if
other packages still depend on the packages that are going to be removed.
Use the FORCE option to forcibly remove a single package, even when it is
still being used by other packages. Use NODEPS to remove only the
specified packages, without also removing the dependencies that become
free. Use USING to remove all the depending packages too (quite
dangerous).

    SELECT USING FROM PACKAGE_SELECTOR1, PACKAGE_SELECTOR2

gives an overview of all packages that depend on the packages that are
going to be removed.

CLEAN
.....

To clean outdated packages, the following syntax is used:

    CLEAN PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

This safely removes old packages for which newer, binary-compatible
installations exist.
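The selection that CLEAN performs could look roughly like the following Python sketch. It is only an illustration under assumptions: the `Installed` record, its tuple-based `version` field and the `abi` tag used to stand in for "binary compatible" are all hypothetical, since the design does not define how binary compatibility is determined.

```python
from typing import NamedTuple


class Installed(NamedTuple):
    """Hypothetical record for one installed package version."""
    name: str
    version: tuple  # assumption: versions compare as tuples, e.g. (1, 2)
    abi: str        # assumption: equal tags mean "binary compatible"


def cleanable(installed):
    """Return the installed entries for which a newer entry with the same
    name and the same (hypothetical) abi tag is also installed."""
    newest = {}
    for pkg in installed:
        key = (pkg.name, pkg.abi)
        if key not in newest or pkg.version > newest[key].version:
            newest[key] = pkg
    return [pkg for pkg in installed if pkg != newest[(pkg.name, pkg.abi)]]


# Illustrative data only.
installed = [
    Installed("glib", (1, 2), "1"),
    Installed("glib", (1, 3), "1"),
    Installed("glib", (2, 0), "2"),
]
print(cleanable(installed))
```

In this sketch only the older entry with abi tag "1" is selected; the entry with tag "2" is kept because no newer compatible installation replaces it.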
CHECK
.....

To check the integrity of a package, the following command is used:

    CHECK PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

This checks whether the provided packages are complete and unchanged, and
returns 1 or 0 for each matching package id. This is of course only valid
for the INSTALLED repository.

    SELECT CHANGED FROM PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

can be used to obtain details.

PACKAGE
.......

To create binary packages, the following command is used:

    PACKAGE PACKAGE_ID1, PACKAGE_ID2, ... [ AS BAGHEERA|RPM|DEB ]

This is again only possible against the INSTALLED repository and makes it
possible to create binary packages in a variety of formats.

SYNC
....

The local repository can be synchronized to and from other repositories:

    SYNC [ AVAILABLE|INSTALLED ] [ TO repository ] [ FROM REMOTE|repository ]

By default this creates the most up-to-date version of the local
repository according to the currently defined list of remote repositories;
this is the same as SYNC AVAILABLE FROM REMOTE.

Besides this default behaviour, SYNC can currently only export a
repository to, and import it from, a text/directory version of the binary
database (SYNC INSTALLED TO text:/var/db/pkg, SYNC INSTALLED FROM
text:/var/db/pkg). This can be used to back up the database, to edit it
manually, to run basic *nix utilities (grep, sed, awk) on it, to migrate
from Portage to Bagheera, and more.

Later on it might be interesting to write push drivers for each repository
format to 'publish' the local repository to a remote repository. For the
moment, however, this is far out of reach. SYNC INSTALLED FROM REMOTE is
also impossible.

INJECT
......

    INJECT PACKAGEID

This makes a void entry for the specified package id in the INSTALLED
repository. It can come in handy when you need to force certain
dependencies to be considered met by third-party packages.

    SELECT INJECTED FROM PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

can be used to get a list of injected packages.

SUBSCRIBE
.........
    SUBSCRIBE [ ON|OFF ] PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

Sets packages as being explicitly installed or not. Explicitly installed
packages are never removed automatically when their last depending package
is removed.

    SELECT SUBSCRIBED FROM PACKAGE_SELECTOR1, PACKAGE_SELECTOR2, ...

can be used to get a list of explicitly installed packages.

ROLLBACK
........

    ROLLBACK [ LIST [timestamp] | EXECUTE timestamp | CLEAR [timestamp] ]

Accesses the rollback functionality. Use LIST with an optional timestamp
argument to obtain the list of actions recorded since the provided
timestamp, or since the beginning if no timestamp is provided. Use EXECUTE
to roll the state of the installed packages back to the provided
timestamp. Use CLEAR with an optional timestamp argument to remove all the
rollback information up to and including the provided timestamp. If no
timestamp is provided, all rollback information is erased.

Easy import/export to ASCII format
----------------------------------

Bagheera is able to export the required information from the internal
database to text files in a predefined directory hierarchy. This should
make it possible to consult and manipulate the contents with regular Unix
tools. Afterwards, Bagheera is able to import all those files again.

[ TODO : the exact structures and data still have to be defined. ]

Side-by-side installation of several versions of the same package
-----------------------------------------------------------------

[ TODO : not really solved, since there can be shared files that don't
follow the prefix approach; I propose to leave this for later ]

This has been discussed here:
http://www.uwyn.com/pipermail/bagheera/2002-June/000076.html

Support for signed packages for security
----------------------------------------

[ TODO : still needs to be designed. ]