CT denoising is important for improving the quality of CT imaging while also improving patient safety, since it allows usable images to be obtained at a lower radiation dose. Convolutional neural networks have been proposed for this task; however, they are very large, can behave unpredictably, and do not exploit prior knowledge of the task. In earlier work, we introduced a reinforcement learning framework that augments the joint bilateral filter by tuning its parameters at each point of the image. In follow-up work, we removed the need for a prior image by introducing a second bilateral filter in the projection domain. We show that an end-to-end CT denoising framework based on reinforcement learning is possible with only four tunable parameters.
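
To make the idea of point-wise parameter tuning concrete, the following is a minimal NumPy sketch of a joint bilateral filter whose spatial and range widths may differ at every pixel. It is an illustration under stated assumptions, not the implementation used in this work: the function name `joint_bilateral_filter`, the parameter names `sigma_spatial` and `sigma_range`, the Gaussian kernels, and the fixed window `radius` are all hypothetical choices, and the abstract does not state which four parameters are tuned. The reinforcement learning agent that would select the parameter maps is not shown.

```python
import numpy as np


def joint_bilateral_filter(noisy, guide, sigma_spatial, sigma_range, radius=2):
    """Joint bilateral filter with per-pixel parameters (illustrative sketch).

    noisy, guide: 2D arrays of equal shape; the guide supplies the range weights.
    sigma_spatial, sigma_range: positive scalars or 2D maps, one value per pixel.
    """
    noisy = np.asarray(noisy, dtype=np.float64)
    guide = np.asarray(guide, dtype=np.float64)
    # Broadcast scalar parameters to full per-pixel maps.
    sig_s = np.broadcast_to(np.asarray(sigma_spatial, dtype=np.float64), noisy.shape)
    sig_r = np.broadcast_to(np.asarray(sigma_range, dtype=np.float64), noisy.shape)

    # Replicate border values so every pixel has a full neighbourhood.
    pad_noisy = np.pad(noisy, radius, mode="edge")
    pad_guide = np.pad(guide, radius, mode="edge")
    h, w = noisy.shape
    weighted_sum = np.zeros_like(noisy)
    weight_total = np.zeros_like(noisy)

    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y0, x0 = radius + dy, radius + dx
            shifted_noisy = pad_noisy[y0:y0 + h, x0:x0 + w]
            shifted_guide = pad_guide[y0:y0 + h, x0:x0 + w]
            # Spatial weight depends on the offset and the local spatial width.
            spatial = np.exp(-(dy * dy + dx * dx) / (2.0 * sig_s ** 2))
            # Range weight depends on intensity differences in the guide image.
            similarity = np.exp(-((shifted_guide - guide) ** 2) / (2.0 * sig_r ** 2))
            weight = spatial * similarity
            weighted_sum += weight * shifted_noisy
            weight_total += weight

    # The centre pixel always contributes weight 1, so the divisor is positive.
    return weighted_sum / weight_total
```

In this sketch, passing the noisy image itself as `guide` gives an ordinary bilateral filter, while passing a separate prior image gives the joint variant. Supplying 2D arrays for `sigma_spatial` and `sigma_range` is what allows the filter strength to vary from point to point.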