python/postgres query

Tim Wegener twegener at fastmail.fm
Wed May 29 09:36:02 CST 2013


Hi Spyro,





On Wed, May 29, 2013, at 01:03 AM, Spyro Polymiadis wrote:

Hi folks,

I have a Python file that queries a database and returns a list of paths,
with output like:

/path/to/folder
/path/to/folder/subfolder1
/path/to/folder/othersubfolder2
/path/to/folder2
/path/to/folder/subfolder1
/path/to/folder3
/path/to/folder4
…Etc

which is stored in a variable called 'paths'.

How can I go through the list and remove all the "subfolder" entries,
leaving only the "top level" folder paths?





#!/usr/bin/env python


def remove_subfolders(folders, sep='/'):
    """Remove folders from the list that are descendants of other folders.

    Arguments:
    folders -- List of path strings. E.g. ['/some/folder', '/some/folder2']
    sep     -- Path separator. Default: '/'

    Return a new list with the descendant folders removed.

    """
    folders_set = set(folders)
    filtered_folders = []
    for folder in folders:
        parts = folder.split(sep)
        is_descendant = False
        # Check every proper ancestor of this folder, shortest first.
        for num_parts in range(1, len(parts)):
            sub_path = sep.join(parts[:num_parts])
            if sub_path in folders_set:
                is_descendant = True
                break
        if not is_descendant:
            filtered_folders.append(folder)
    return filtered_folders


input_folders = [
    '/path/to/folder',
    '/path/to/folder/subfolder1',
    '/path/to/folder/othersubfolder2',
    '/path/to/folder2',
    '/path/to/folder/subfolder1',
    '/path/to/folder/another/deeper',
    '/path/to/folder3',
    '/path/to/folder4',
    '/path/to/some/folder5',
    ]

filtered_folders = remove_subfolders(input_folders)
print(filtered_folders)

assert filtered_folders == ['/path/to/folder',
                            '/path/to/folder2',
                            '/path/to/folder3',
                            '/path/to/folder4',
                            '/path/to/some/folder5']
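
If the order of the result does not matter, a sort-based variant may be
worth considering. This is only a sketch, and the name top_level_folders
is made up for illustration. It sorts the paths by their components, so
that every descendant lands directly after its ancestor, and then keeps a
path only if it does not sit under the last path kept. Unlike the function
above it also drops exact duplicates, and it returns the result sorted
rather than in input order.

```python
def top_level_folders(folders, sep='/'):
    """Return the sorted, de-duplicated folders that have no ancestor
    elsewhere in the list.

    Hypothetical helper, not part of any library.
    """
    kept = []
    # Sort on the split components rather than on the raw string.
    # A plain string sort would place '/a-x' between '/a' and '/a/b',
    # because '-' sorts before '/', and the check below would then
    # compare '/a/b' against '/a-x' instead of '/a'.
    for folder in sorted(set(folders), key=lambda p: p.split(sep)):
        if kept and folder.startswith(kept[-1] + sep):
            # Sits below the most recently kept folder, so skip it.
            continue
        kept.append(folder)
    return kept
```

Called on the input_folders list from the script above, this should give
the same five folders as the assert there.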











What would be nicer is if there were an option I could add to my SELECT
statement in Postgres to return only the right folders.





That sounds tough, since the check for each row would depend on every
other row, and what it needs to check is variable length (in terms of
path components).
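
That said, one way to express the "depends on every other row" check is a
self-join with NOT EXISTS: keep a row only if no other row is a proper
prefix of it ending at a path separator. The sketch below is an
assumption-heavy illustration, not a tested Postgres query. The table
name "paths" and column name "path" are invented, and it runs against an
in-memory SQLite database (Python's sqlite3 module) purely so that it is
self-contained. It sticks to substr(), length() and ||, which Postgres
also provides, but the query would still need checking there, and it
compares every row with every other row, so it may be slow on a large
table.

```python
import sqlite3

# Hypothetical schema: a table "paths" with a single text column "path".
QUERY = """
SELECT DISTINCT p.path
FROM paths AS p
WHERE NOT EXISTS (
    SELECT 1
    FROM paths AS q
    -- q is an ancestor of p when p starts with q followed by '/'.
    -- substr() is used instead of LIKE so that '%' and '_' inside a
    -- path are not treated as wildcards.
    WHERE substr(p.path, 1, length(q.path) + 1) = q.path || '/'
)
ORDER BY p.path
"""


def top_level_paths(rows):
    """Load rows into a throwaway SQLite table and run QUERY on it."""
    conn = sqlite3.connect(':memory:')
    try:
        conn.execute('CREATE TABLE paths (path TEXT)')
        conn.executemany('INSERT INTO paths (path) VALUES (?)',
                         [(row,) for row in rows])
        return [result[0] for result in conn.execute(QUERY)]
    finally:
        conn.close()
```

The SELECT DISTINCT also removes repeated rows, which the Python function
above does not do for top-level folders.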









Any ideas welcome :)





HTH,

Tim









Cheers

Spyro




