[albatross-users] Using additional path information as request parameters

Mon Jul 7 23:03:26 EST 2003

>>>>> "Matt" == Matt Goodall <matt at pollenation.net> writes:

Matt> In some cases I would like to have a _random_ app page process a
Matt> request where the request parameters are appended to the URL in
Matt> a path-like manner. For instance, I might have an Albatross page
Matt> that displays a book's details. The conventional way to achieve
Matt> this is to use the following URL:

Matt>     /app.py/book?id=123

Matt> What I would rather use is a URL with 123 as the last part of
Matt> the path:

Matt>     /app.py/book/123

Matt> In both cases the book module would process the request.

Matt> The only way I can think of is to use some fairly complex
Matt> mod_rewrite rules (which I prefer to avoid whenever possible)
Matt> that turn book/123 into book?id=123. Does anyone have a
Matt> better idea?

Matt> The reason for all this? It produces search engine friendly
Matt> URLs. Some search engines still do not crawl pages with request
Matt> parameters.

Matt> A great section for the Wiki would be integrating Albatross with
Matt> Apache and possibly IIS if anyone uses it. I'll add that to the
Matt> list.

One approach that might make sense is to pass the URL up to the
application so it can help take it apart.  If you look at the
Application.run() method, load_page() is the method that you should
override:

    def run(self, req):
        '''Process a single browser request
        '''
        ctx = None
        try:
            ctx = self.create_context()
            ctx.set_request(req)
            self.load_session(ctx)
            self.load_page(ctx)
            if self.validate_request(ctx):
                self.merge_request(ctx)
                self.process_request(ctx)
            self.display_response(ctx)
            self.save_session(ctx)
            ctx.flush_content()
        except Redirect, e:
            self.save_session(ctx)
            return req.redirect(e.loc)
        except:
            self.handle_exception(ctx, req)
        return req.status()

This is how the current RandomPageModuleMixin class works.  It
provides a load_page() method that pulls apart the URL to locate a
page module.

You might even be able to adapt RandomPageModuleMixin.load_page()
to suit your needs.

Current RandomPageModuleMixin.load_page() method:

    def load_page(self, ctx):
        # Get page name from request URI
        uri = ctx.request.get_uri()
        page = ''
        try:
            base_path = urlparse.urlparse(self.base_url())[2]
            uri_path = urlparse.urlparse(uri)[2]
            page = uri_path.split(base_path, 1)[1]
        except IndexError:
            pass
        if not page:
            ctx.redirect(self.start_page())
        # [snip]

Proposed new method(s), with the URI parsing split out:

    def load_page(self, ctx):
        # Get page name from request URI
        uri = ctx.request.get_uri()
        page = self.get_page_from_uri(ctx, uri)
        if not page:
            ctx.redirect(self.start_page())
        # [snip]

    def get_page_from_uri(self, ctx, uri):
        try:
            base_path = urlparse.urlparse(self.base_url())[2]
            uri_path = urlparse.urlparse(uri)[2]
            return uri_path.split(base_path, 1)[1]
        except IndexError:
            return None

Then in your application you are free to implement any kind of URI
splitting you like by overriding get_page_from_uri().  All you have to
do is return something that the load_page_module() method (in
PageModuleMixin) can use to load a page module.
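
As a rough sketch of what such an override might look like (not tested
against Albatross itself): the mixin below treats the first path segment
after the base URL as the page name and stashes the rest on the context.
The names PathParamsMixin, DemoApp, DemoCtx and the ctx.path_args
attribute are made up for illustration; only base_url() and
get_page_from_uri() come from the code above, and get_page_from_uri()
assumes the split proposed there has been applied.

```python
try:
    from urllib.parse import urlparse   # Python 3
except ImportError:
    from urlparse import urlparse       # Python 2, as used by Albatross


class PathParamsMixin(object):
    """Hypothetical mixin: /app.py/book/123 -> page 'book', args ['123']."""

    def get_page_from_uri(self, ctx, uri):
        base_path = urlparse(self.base_url())[2]
        uri_path = urlparse(uri)[2]
        if not uri_path.startswith(base_path):
            return None
        rest = uri_path[len(base_path):]
        # Reject paths that merely share a prefix, e.g. /app.pyx/book
        if rest and not rest.startswith('/'):
            return None
        parts = [p for p in rest.split('/') if p]
        if not parts:
            return None
        # Assumption: extra path segments are kept on the context under
        # an invented attribute so the page module can read them later.
        ctx.path_args = parts[1:]
        return parts[0]


# Stand-ins so the sketch runs on its own; a real application would mix
# PathParamsMixin into its Albatross application class instead.
class DemoApp(PathParamsMixin):
    def base_url(self):
        return '/app.py'


class DemoCtx(object):
    pass


app = DemoApp()
ctx = DemoCtx()
page = app.get_page_from_uri(ctx, '/app.py/book/123')
print(page, ctx.path_args)
```

The book page module would then read the id from ctx.path_args[0]
rather than from a request field.  Returning None when nothing matches
keeps the existing redirect to the start page in load_page().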

Does any of that make sense?

- Dave

-- 
http://www.object-craft.com.au