In this paper, we propose an online self-optimizing QoS control framework, based on the Sarsa(λ) algorithm and located in the middleware layer, to solve the differentiated average response time control problem in distributed services. Unlike existing solutions, the proposed controller learns its control policy autonomously and does not require explicit domain expert knowledge to tune the controller manually. We have implemented a prototype of the framework on an existing middleware platform, the Internet Communication Engine (ICE), and conducted comprehensive experiments across a wide range of workload conditions to evaluate its performance. Experimental results show that the Sarsa(λ)-based controller learns the control policy efficiently and effectively, and that it outperforms both a Self-Tuning Fuzzy Controller (STFC) and a Proportional (P) controller.