Semantics and planning based workflow composition and execution for video processing
Traditional workflow systems have several drawbacks, e.g. in their inabilities to rapidly react to changes, to construct workflow automatically (or with user involvement) and to improve performance autonomously (or with user involvement) in an incremental manner according to specified goals. Overcoming these limitations would be highly beneficial for complex domains where such adversities are exhibited. Video processing is one such domain that increasingly requires attention as larger amounts of images and videos are becoming available to persons who are not technically adept in modelling the processes that are involved in constructing complex video processing workflows. Conventional video and image processing systems, on the other hand, are developed by programmers possessing image processing expertise. These systems are tailored to produce highly specialised hand-crafted solutions for very specific tasks, making them rigid and non-modular. The knowledge-based vision community have attempted to produce more modular solutions by incorporating ontologies. However, they have not been maximally utilised to encompass aspects such as application context descriptions (e.g. lighting and clearness effects) and qualitative measures. This thesis aims to tackle some of the research gaps yet to be addressed by the workflow and knowledge-based image processing communities by proposing a novel workflow composition and execution approach within an integrated framework. This framework distinguishes three levels of abstraction via the design, workflow and processing layers. The core technologies that drive the workflow composition mechanism are ontologies and planning. Video processing problems provide a fitting domain for investigating the effectiveness of this integratedmethod as tackling such problems have not been fully explored by the workflow, planning and ontological communities despite their combined beneficial traits to confront this known hard problem. In addition, the pervasiveness of video data has proliferated the need for more automated assistance for image processing-naive users, but no adequate support has been provided as of yet. A video and image processing ontology that comprises three sub-ontologies was constructed to capture the goals, video descriptions and capabilities (video and image processing tools). The sub-ontologies are used for representation and inference. In particular, they are used in conjunction with an enhanced Hierarchical Task Network (HTN) domain independent planner to help with performance-based selection of solution steps based on preconditions, effects and postconditions. The planner, in turn, makes use of process models contained in a process library when deliberating on the steps and then consults the capability ontology to retrieve a suitable tool at each step. Two key features of the planner are the ability to support workflow execution (interleaves planning with execution) and can perform in automatic or semi-automatic (interactive) mode. The first feature is highly desirable for video processing problems because execution of image processing steps yield visual results that are intuitive and verifiable by the human user, as automatic validation is non trivial. In the semiautomaticmode, the planner is interactive and prompts the user tomake a tool selection when there is more than one tool available to perform a task. The user makes the tool selection based on the recommended descriptions provided by the workflow system. Once planning is complete, the result of applying the tool of their choice is presented to the user textually and visually for verification. This plays a pivotal role in providing the user with control and the ability to make informed decisions. Hence, the planner extends the capabilities of typical planners by guiding the user to construct more optimal solutions. Video processing problems can also be solved in more modular, reusable and adaptable ways as compared to conventional image processing systems. The integrated approach was evaluated on a test set consisting of videos originating from open sea environment of varying quality. Experiments to evaluate the efficiency, adaptability to user’s changing needs and user learnability of this approach were conducted on users who did not possess image processing expertise. The findings indicate that using this integrated workflow composition and execution method: 1) provides a speed up of over 90% in execution time for video classification tasks using full automatic processing compared to manual methods without loss of accuracy; 2) is more flexible and adaptable in response to changes in user requests (be it in the task, constraints to the task or descriptions of the video) than modifying existing image processing programs when the domain descriptions are altered; 3) assists the user in selecting optimal solutions by providing recommended descriptions.