This thesis explores a number of parameter estimation techniques for conditional random fields, a recently introduced  probabilistic model for labelling and segmenting sequential data. Theoretical and practical disadvantages of the training techniques reported in current literature on CRFs are discussed. We hypothesise that general numerical optimisation techniques result in improved performance over iterative scaling algorithms for training CRFs. Experiments run on a a subset of a well-known text chunking data set  confirm that this is indeed the case. This is a highly promising result, indicating that such parameter estimation techniques make CRFs a practical and efficient choice for labelling sequential data, as well as a theoretically sound and principled probabilistic framework.