P4535 [CTSC2004] Digit Search
Background
The HURRICANE team recently received a task to search text, namely to match substrings in a long text composed of digits that satisfy specified conditions. The search conditions are described by regular expressions such as ‘(0+10\*1)\*10\*’. The inductive definition of regular expressions is as follows:
1. 0, 1, …, 9, 0\*, 1\*, …, 9\* are regular expressions.
2. If A and B are regular expressions, then (A), A+B, AB, (A)\* are regular expressions.
3. Only expressions constructed in the above ways are regular expressions.
Here, A+B denotes the “or” relation, AB denotes “concatenation,” and (A)\* denotes “repeat” the content of A zero or more times. For example, the regular expression (12+3)(4+5)6\* matches strings that start with one of 124, 125, 34, 35, followed by zero or any number of 6's (for example, the string 12566). The regular expression (1+0)\* matches all strings composed of 0 and 1, or the empty string. If a regular expression cannot match the empty string, it is called nonempty. In this problem, we only consider nonempty regular expressions.
If, at some position in the given text, there exists a substring ending at that position that can be matched by the given nonempty regular expression, then that position is called matchable. The task given to the HURRICANE team is to find all matchable positions. Can you help them complete this task?
Description
Your program needs to produce output according to the given input:
- The input consists of a regular expression defined as above and a long piece of text.
- Based on the input regular expression and text, you need to find all matchable positions in the text.
- Your output should include all matchable positions.
Input Format
The input file regular.in contains a regular expression on the first line and the text to be processed on the second line:
- The first line contains two parts separated by a space:
- A nonnegative integer n ($1 \le n \le 10$), indicating that the set of digits we consider (i.e., digits appearing in the regular expression and the text) is $0, 1, \ldots, n-1$.
- Next is a regular expression, composed of the 4 symbols from {‘(‘, ’)’, ’+’, ’*’} and the digits from $\{0, \ldots, n-1\}$, with length not exceeding 500 characters.
- The second line is a string composed of digits from 0 to $n-1$, which is the text to be processed. The length of this text does not exceed 10,000,000 characters.
Output Format
The output file contains only one line, consisting of integers separated by spaces, output in ascending order, representing every matchable position in the text to be processed.
Explanation/Hint
Explanation: For the sample input, the text to be processed is ’312331’, and only the 5th character position (the underlined one) is matchable. Here, ’33’ in the regular expression ’12333+33’ can match it.
Hint: In the testdata for this problem, in 6 test points the regular expression does not contain ’*’. Among these, in 3 test points, the regular expression consists only of digits and ’+’. In one test point, the text to be processed does not exceed 1,000,000 characters.
Translated by ChatGPT 5