P6276 [USACO20OPEN] Exercise P
Description
Farmer John has (again) come up with a new morning exercise plan for the cows!
As before, Farmer John’s $N$ cows stand in a line. For each $1\le i\le N$, the $i$-th cow from left to right has ID $i$. He tells them to repeat the following step until the cows are in the same order as they started.
Given a permutation $A$ of length $N$, the cows change their order so that the cow that was the $i$-th from left to right before the change becomes the $A_i$-th from left to right after the change.
For example, if $A=(1,2,3,4,5)$, then the cows return to the same order after a total of one step. If $A=(2,3,1,5,4)$, then the cows return to the starting order after a total of six steps. After each step, the order of the cows from left to right is as follows:
Step 0: $(1,2,3,4,5)$
Step 1: $(3,1,2,5,4)$
Step 2: $(2,3,1,4,5)$
Step 3: $(1,2,3,5,4)$
Step 4: $(3,1,2,4,5)$
Step 5: $(2,3,1,5,4)$
Step 6: $(1,2,3,4,5)$
**Please compute the product, over all $N!$ permutations $A$ of length $N$, of the number of steps needed to return to the starting order.**
Since this number can be very large, output the remainder modulo $M$ ($10^8\le M\le 10^9+7$, and $M$ is prime).
-----
C++ users may use the following code from [KACTL](https://github.com/kth-competitive-programming/kactl/blob/master/content/various/FastMod.h). This algorithm, called [Barrett reduction](https://en.wikipedia.org/wiki/Barrett_reduction), can compute $a \% b$ several times faster than the usual method, where $b>1$ is a constant unknown at compile time. (Unfortunately, we were unable to find a similar optimization for Java.) (Translator’s note: Chinese users may refer to 几种取模优化方法[(译自 min-25 的博客)](https://loj.ac/article/327).)
```cpp
#include
using namespace std;
typedef unsigned long long ull;
typedef __uint128_t L;
struct FastMod {
ull b, m;
FastMod(ull b) : b(b), m(ull((L(1) > 64);
ull r = a - q * b; // can be proven that 0 = b ? r - b : r;
}
};
FastMod F(2);
int main() {
int M = 1000000007; F = FastMod(M);
ull x = 10ULL*M+3;
cout
Input Format
The input contains one line with $N$ and $M$.
Output Format
Output one integer.
Explanation/Hint
#### Sample Explanation:
For each $1\le i\le N$, the $i$-th element of the following sequence equals the number of permutations for which the cows need $i$ steps: $[1,25,20,30,24,20]$. Therefore, the answer is
$1^1\cdot 2^{25}\cdot 3^{20}\cdot 4^{30}\cdot 5^{24}\cdot 6^{20}\equiv 369329541\pmod{10^9+7}$.
**Note: The memory limit for this problem is increased to 512 MB.**
---
For $100\%$ of the testdata, $1\le N\le 7500$.
There are $16$ test points. Test point $1$ is the sample, and the others have the following properties:
Test point $2$ satisfies $N=8$.
Test points $3\sim 5$ satisfy $N\le 50$.
Test points $6\sim 8$ satisfy $N\le 500$.
Test points $9\sim 12$ satisfy $N\le 3000$.
Test points $13\sim 16$ have no additional constraints.
----
Problem setter: Benjamin Qi
Translated by ChatGPT 5