[albatross-users] pagination with al-for

Wed Jan 7 12:01:09 EST 2004

>To recap, the issue is that the current page module loading mechanism is 
>broken for anything non-trivial. When a page module is imported it is 
>registered in sys.modules under its module name. The absence of packages 
>means that the page module name can easily clash with other modules and 
>packages, overwriting entries in sys.modules. That causes all sorts of 
>fun and games ;-).
>
>(Note that Albatross page modules are just Python modules in some 
>directory on disk and that page modules in a directory structure are 
>just Python modules scattered across a number of different directories 
>on disk. There is no package structure.)

Indeed. A page module is a regular Python module, and occupies the same
namespace as any other Python module, hence the collisions. Arguably,
page modules should live in their own namespace.
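
To make the collision concrete, here is a small sketch. It assumes a loader
that registers each page under its bare module name, as described above;
naive_load and the demo_login_page files are made up for the demonstration
and are not Albatross code. It is written with importlib so that it runs on
a current Python.

```python
import importlib.util
import os
import sys
import tempfile

def naive_load(directory, name):
    """Load name.py from directory, registered under its bare name."""
    filename = os.path.join(directory, name + '.py')
    spec = importlib.util.spec_from_file_location(name, filename)
    mod = importlib.util.module_from_spec(spec)
    sys.modules[name] = mod          # bare name: this is where clashes happen
    spec.loader.exec_module(mod)
    return mod

# Two page directories, each with a page module of the same name.
root = tempfile.mkdtemp()
for sub in ('admin', 'user'):
    os.mkdir(os.path.join(root, sub))
    with open(os.path.join(root, sub, 'demo_login_page.py'), 'w') as f:
        f.write("who = %r\n" % sub)

admin_page = naive_load(os.path.join(root, 'admin'), 'demo_login_page')
user_page = naive_load(os.path.join(root, 'user'), 'demo_login_page')

# The second load has replaced the first entry in sys.modules, so any
# later "import demo_login_page" sees the user page, not the admin page.
current = sys.modules['demo_login_page']
```

After this runs, current is user_page, and the admin page is reachable
only through the reference we happened to keep.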

>There have been a couple of suggestions for how to fix the problem:
>
>   1. Hack the import machinery. Instead of importing page modules as
>      is, concoct a package and module name to avoid clashes. I think I
>      found a problem with this approach (something to do with importing
>      builtin packages like os and os.path) but I can't remember the
>      details now. I think I also found a way to avoid the problem.

I've done this before - I don't remember running into any real problems.
I've discarded my earlier implementation, but here's something I just
knocked up - it appears to work as desired:

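    import imp, sys

    # Name of the synthetic parent module that holds every page module.
    mod_holder_name = '__snot__'

    def load_page_module(path, name):
        # Create the holder module on first use and register it.
        try:
            mod_holder = sys.modules[mod_holder_name]
        except KeyError:
            mod_holder = imp.new_module(mod_holder_name)
            sys.modules[mod_holder_name] = mod_holder
            globals()[mod_holder_name] = mod_holder

        # Look for the page module in the given directory only.
        f, path, desc = imp.find_module(name, [path])
        try:
            # Register the module as '__snot__.<name>', not '<name>'.
            abs_name = mod_holder_name + '.' + name
            mod = imp.load_module(abs_name, f, path, desc)
            setattr(mod_holder, name, mod)
        finally:
            # find_module returns f as None for a package directory.
            if f:
                f.close()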

    >>> load_page_module('.', 't')
    >>> __snot__.t.Sigh
    'Hi'

I've played around loading stuff from the "page" module t.py, and various
other things, and it just seems to work. It's probably not necessary or
desirable to poke everything back into globals() in the context of the
Albatross machinery.
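
For anyone trying this on a recent Python, where the imp module is no
longer available (it was removed in 3.12), the same idea can be sketched
with importlib. This is a rough equivalent and not a tested replacement:
the holder name is kept from the code above, it only handles plain .py
files, and it returns the module instead of poking it into globals().

```python
import importlib.util
import os
import sys
import tempfile
import types

MOD_HOLDER_NAME = '__snot__'

def load_page_module(path, name):
    """Load name.py from directory path as '<holder>.<name>'."""
    # Create the holder module on first use and register it.
    holder = sys.modules.get(MOD_HOLDER_NAME)
    if holder is None:
        holder = types.ModuleType(MOD_HOLDER_NAME)
        sys.modules[MOD_HOLDER_NAME] = holder

    abs_name = MOD_HOLDER_NAME + '.' + name
    filename = os.path.join(path, name + '.py')
    spec = importlib.util.spec_from_file_location(abs_name, filename)
    if spec is None:
        raise ImportError('cannot load page module %r' % filename)
    mod = importlib.util.module_from_spec(spec)

    # Register under the qualified name only, then run the module body.
    sys.modules[abs_name] = mod
    try:
        spec.loader.exec_module(mod)
    except BaseException:
        del sys.modules[abs_name]
        raise
    setattr(holder, name, mod)
    return mod

# Usage: a throwaway page module t.py, as in the session above.
page_dir = tempfile.mkdtemp()
with open(os.path.join(page_dir, 't.py'), 'w') as f:
    f.write("Sigh = 'Hi'\n")

page = load_page_module(page_dir, 't')
```

The page ends up in sys.modules as '__snot__.t', so a page called t no
longer competes with any top-level module called t.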

Can anyone see any problems with this scheme?

>   2. Force the use of a proper package structure for page modules.
>      Albatross could then import modules in the normal way; developers
>      would have to scatter a page module directory structure with
>      __init__.py files. This all sounds fine but I think it changes the
>      concept of page modules a little. It would also break my code but
>      I don't think it would be too difficult to fix that.

The __init__.py problem only occurs for people who use the package
structure - it should be backward compatible. The other virtue of this
is that complex applications can implement a hierarchy of page modules.
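
As an illustration of option 2, the sketch below builds a small page
hierarchy with __init__.py files and imports a page the normal way. The
demo_pages layout and the login page are invented for the example; how
Albatross would map a page name onto a dotted module name is left open.

```python
import importlib
import os
import sys
import tempfile

# Layout created below (all names are hypothetical):
#   <root>/demo_pages/__init__.py
#   <root>/demo_pages/admin/__init__.py
#   <root>/demo_pages/admin/login.py
root = tempfile.mkdtemp()
pages_dir = os.path.join(root, 'demo_pages')
admin_dir = os.path.join(pages_dir, 'admin')
os.makedirs(admin_dir)

# An __init__.py in each directory makes it a regular package.
for pkg_dir in (pages_dir, admin_dir):
    open(os.path.join(pkg_dir, '__init__.py'), 'w').close()

with open(os.path.join(admin_dir, 'login.py'), 'w') as f:
    f.write("title = 'Admin login'\n")

# With the root on sys.path the page is an ordinary dotted import.
sys.path.insert(0, root)
importlib.invalidate_caches()
login = importlib.import_module('demo_pages.admin.login')
```

The module is registered as 'demo_pages.admin.login', so two pages called
login in different directories get different sys.modules keys.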

>   3. Use execfile() to "import" page modules. This avoids touching
>      sys.modules at all which is great. However, there is a possible
>      (i.e. no one has profiled it yet) performance hit for plain old
>      CGI users because Albatross would no longer benefit from compiled
>      page modules. There is also an issue with storing certain types of
>      objects in the session, caused by the way the pickle machinery works.

I've measured the hit in simple cases, but as you say, it's insignificant
in the context of normal Albatross CGI execution times. Still, it's a
slippery slope...
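
For reference, option 3 amounts to something like the sketch below. On a
current Python execfile() is gone, so this uses compile() and exec(), which
is the usual stand-in. exec_page and the '__page__' name are made up for the
example. The source is recompiled on every call, which is where the missing
benefit of compiled page modules comes from.

```python
import os
import sys
import tempfile

def exec_page(filename):
    """Run a page file in a fresh namespace and return that namespace."""
    namespace = {'__name__': '__page__', '__file__': filename}
    with open(filename) as f:
        # Compiled from source each time; nothing is cached on disk.
        code = compile(f.read(), filename, 'exec')
    exec(code, namespace)
    return namespace

# Usage with a throwaway page file.
page_dir = tempfile.mkdtemp()
page_file = os.path.join(page_dir, 'front.py')
with open(page_file, 'w') as f:
    f.write("title = 'Front page'\n")

ns = exec_page(page_file)
registered = '__page__' in sys.modules
```

The page's names live in the returned dictionary and nothing is added to
sys.modules, so there is nothing to clash with.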

I think options 1 and 3 both suffer from the pickle problem, but I'd
argue that it's a user application structure shortcoming and we should
not be overly concerned about it.
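
My reading of the pickle problem, which may not be the whole story, is
that pickle stores instances by module name and class name, so a class
defined in a page that was never importable under that name cannot be
saved in the session. A minimal sketch of the exec-style case, with a
made-up Cart class and page name:

```python
import pickle

# A "page" that defines a class, run exec-style (no sys.modules entry).
source = (
    "class Cart:\n"
    "    def __init__(self):\n"
    "        self.items = []\n"
)
namespace = {'__name__': 'demo_exec_page'}
exec(compile(source, '<page>', 'exec'), namespace)

cart = namespace['Cart']()

# pickle records the class as 'demo_exec_page.Cart' and tries to import
# that module to check the reference; no such module exists.
try:
    pickle.dumps(cart)
    picklable = True
except pickle.PicklingError:
    picklable = False
```

Moving such classes into an ordinary importable module, and leaving only
page logic in the page modules, sidesteps this, which is why I would call
it an application structure issue.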

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/