Definition

$$\begin{aligned}
\nabla_{\theta}J(\theta) &= \nabla_{\theta} E_{s\sim \rho_{\mu}}[Q(s, a)|_{a=\mu(s;\theta)}]\\
&= \int_{S} \rho_{\mu}(s)\nabla_{a}Q(s, a)|_{a=\mu(s;\theta)}\nabla_{\theta}\mu(s;\theta)\,ds\quad\text{(by chain rule)}\\
&= E_{s\sim\rho_{\mu}}[\nabla_{a}Q(s, a)|_{a=\mu(s;\theta)}\nabla_{\theta}\mu(s;\theta)]
\end{aligned}$$
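
The chain-rule step can be checked numerically on a toy problem. The sketch below is illustrative only and makes several assumptions that are not part of the definition: a scalar state and action, a hypothetical linear policy `mu(s, theta) = theta * s`, a hypothetical critic `q(s, a) = -(a - 2s)^2`, and a fixed batch of sampled states standing in for $\rho_{\mu}$ (so the dependence of the state distribution on $\theta$ is ignored). It compares the sample average of $\nabla_{a}Q\,\nabla_{\theta}\mu$ with a finite-difference estimate of the objective's gradient.

```python
import random

# Hypothetical toy policy and critic, chosen only so the gradients are easy to write down.
def mu(s, theta):
    """Deterministic policy: action is linear in the state."""
    return theta * s

def q(s, a):
    """Critic: highest when the action equals 2 * s."""
    return -(a - 2.0 * s) ** 2

def dq_da(s, a):
    """Gradient of the critic with respect to the action."""
    return -2.0 * (a - 2.0 * s)

def dmu_dtheta(s, theta):
    """Gradient of the policy with respect to its parameter."""
    return s

def dpg_estimate(states, theta):
    """Sample average of dQ/da (evaluated at a = mu(s; theta)) times dmu/dtheta."""
    total = 0.0
    for s in states:
        a = mu(s, theta)
        total += dq_da(s, a) * dmu_dtheta(s, theta)
    return total / len(states)

def objective(states, theta):
    """Sample average of Q(s, mu(s; theta)) over the fixed batch of states."""
    return sum(q(s, mu(s, theta)) for s in states) / len(states)

def finite_difference(states, theta, eps=1e-5):
    """Central-difference approximation of the objective's gradient."""
    return (objective(states, theta + eps) - objective(states, theta - eps)) / (2 * eps)

rng = random.Random(0)
# Fixed state samples stand in for draws from the state distribution (an assumption).
states = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
theta = 0.5

analytic = dpg_estimate(states, theta)
numeric = finite_difference(states, theta)
print(abs(analytic - numeric) < 1e-6)
```

In this toy setup the two estimates agree up to floating-point error, and the gradient is positive at `theta = 0.5`, pushing the parameter toward the value `2.0` at which the critic is maximized for every state.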