A key issue in video object tracking is the representation of the objects and how effectively it discriminates between different objects. Several techniques have been proposed, but without a generally accepted method. While analysis and comparisons of these individual methods have been presented in the literature, their evaluation as part of a global solution has been overlooked. The appearance model for the objects is a component of a video object tracking framework, depending on previous processing stages and affecting those that succeed it. As a result, these interdependencies should be taken into account when analysing the performance of the object description techniques. We propose an integrated analysis of object descriptors and appearance models through their comparison in a common object tracking solution. The goal is to contribute to a better understanding of object description methods and their impact on the tracking process. Our contributions are threefold: propose a novel descriptor evaluation and characterisation paradigm; perform the first integrated analysis of state-of-the-art description methods in a scenario of people tracking; put forward some ideas for appearance models to use in this context. This work provides foundations for future tests and the proposed assessment approach contributes to the informed selection of techniques more adequately for a given tracking application context.