Definition
$$\begin{aligned}
\nabla_{\theta}J(\theta)
&= \nabla_{\theta} E_{s\sim \rho_{\mu}}[Q(s, a)|_{a=\mu(s;\theta)}]\\
&= \int_{S} \rho_{\mu}(s)\nabla_{a}Q(s, a)|_{a=\mu(s;\theta)}\nabla_{\theta}\mu(s;\theta)ds\quad\text{(by chain rule)}\\
&= E_{s\sim\rho_{\mu}}[\nabla_{a}Q(s, a)|_{a=\mu(s;\theta)}\nabla_{\theta}\mu(s;\theta)]
\end{aligned}$$
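
As a sanity check on the last line of the derivation, here is a minimal sketch that estimates the expectation by averaging $\nabla_{a}Q(s,a)|_{a=\mu(s;\theta)}\nabla_{\theta}\mu(s;\theta)$ over sampled states and compares it with a finite-difference gradient of the objective. Everything concrete here is an assumption made for illustration: the quadratic critic `q_value`, the linear policy `policy`, the scalar parameter `theta`, and the uniform state samples, which stand in for $\rho_{\mu}$ and are held fixed while $\theta$ is perturbed.

```python
import random

# Hypothetical critic Q(s, a): largest when the action equals 2 * s.
def q_value(s, a):
    return -(a - 2.0 * s) ** 2

# Analytic derivative of the hypothetical critic with respect to the action.
def dq_da(s, a):
    return -2.0 * (a - 2.0 * s)

# Hypothetical deterministic policy mu(s; theta) = theta * s.
def policy(s, theta):
    return theta * s

# Derivative of the policy output with respect to theta.
def dpolicy_dtheta(s, theta):
    return s

def dpg_estimate(states, theta):
    """Sample average of grad_a Q(s, a)|a=mu(s) * grad_theta mu(s; theta)."""
    total = 0.0
    for s in states:
        a = policy(s, theta)
        total += dq_da(s, a) * dpolicy_dtheta(s, theta)
    return total / len(states)

def objective(states, theta):
    """Sample average of Q(s, mu(s; theta)) over the same fixed states."""
    return sum(q_value(s, policy(s, theta)) for s in states) / len(states)

random.seed(0)
# Assumption: states drawn uniformly from [-1, 1] stand in for rho_mu.
states = [random.uniform(-1.0, 1.0) for _ in range(1000)]
theta = 0.5

grad = dpg_estimate(states, theta)

# Central finite difference of the objective, with the state samples held fixed.
eps = 1e-5
fd_grad = (objective(states, theta + eps) - objective(states, theta - eps)) / (2 * eps)

print("chain-rule estimate:", grad)
print("finite difference:  ", fd_grad)
```

The two printed values should agree closely, because with the state samples held fixed the objective depends on `theta` only through the action. In an actual actor-critic setup, `dq_da` would come from differentiating a learned critic rather than from a closed-form expression.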