Populate 2D array from input text file
I have a corpus of data full of instances of the following form:
'be in'('force', 'the closed area').
'advise'('coxswains', 'mr mak').
'be'('a good', 'restricted area').
'establish from'('person \'s id', 'the other').
I want to read this data from a .txt file and populate a 2D array using only the information inside the single quotes, i.e.
be in [0][0], force [0][1], the closed area [0][2]
advise [1][0], coxswains [1][1], mr mak [1][2]
be [2][0], a good [2][1], restricted area [2][2]
establish from [3][0], person \'s id [3][1], the other [3][2]
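^ The array indices are there only as a conceptual reference. As I said above, only the information inside the single quotes is wanted, e.g. index [0][0] would be be in, and index [3][1] would be person \'s id.
But as the example at index [3][1] shows, a single quote may be preceded by a backslash, in which case it should not be interpreted as a delimiter.
This is what I have so far: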
BufferedReader br_0 = new BufferedReader(new FileReader("/home/matthias/Workbench/SUTD/2_January/Prolog/horn_data_test.pl"));
String line_0;
while ((line_0 = br_0.readLine()) != null)
{
    String[] items = line_0.split("'");
    String[][] dataArray = new String [3][262978];
    int i;
    for (String item : items)
    {
        for (i = 0; i<items.length; i++)
        {
            if (i == 0)
            {
                System.out.println("first arg: " + items[i]);
            }
            if (i == 1)
            {
                System.out.println("first arg: " + items[i]);
            }
            if (i == 2)
            {
                System.out.println("second arg: " + items[i]);
            }
        }
    }
}
br_0.close();
I know I need something like this:
if (the character under consideration == ' && the one before it is not \)
put it into first index, etc. etc.
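But how do I get it to stop before the next delimiter? What is the best way to populate that array? The input file is very large, so I am trying to optimize for efficiency.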
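The check described in the pseudocode above can be written as a single pass over each line that toggles between "inside a field" and "outside a field" at every unescaped quote. The following is only a minimal sketch: the class name QuoteScanner and the method quotedFields are made up for illustration, and it assumes a backslash is only ever used to escape a quote.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteScanner {
    /**
     * Returns the contents of every single-quoted field in the line.
     * A quote directly preceded by a backslash is kept as part of the field.
     */
    public static List<String> quotedFields(String line) {
        List<String> fields = new ArrayList<>();
        int start = -1; // index just after the opening quote, or -1 when outside a field
        for (int i = 0; i < line.length(); i++) {
            boolean unescapedQuote = line.charAt(i) == '\''
                    && (i == 0 || line.charAt(i - 1) != '\\');
            if (!unescapedQuote) {
                continue;
            }
            if (start < 0) {
                start = i + 1;                        // opening quote
            } else {
                fields.add(line.substring(start, i)); // closing quote
                start = -1;
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // "\\'" in Java source is the two characters \' in the actual string
        String line = "'establish from'('person \\'s id', 'the other').";
        System.out.println(quotedFields(line));
        // prints [establish from, person \'s id, the other]
    }
}
```

Each returned list can then be copied into one row of the array, so "stopping before the next delimiter" amounts to taking the substring between an opening and a closing unescaped quote.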
You can use this regex to match single-quoted strings with support for escaped quotes:
'(.*?)(?<!\\)'
Use matcher.group(1) to get the string inside the quotes.
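One detail that is easy to get wrong is the escaping: the regex needs an escaped backslash inside the lookbehind, and a Java string literal doubles every backslash again. The sketch below shows how that looks in source code; the class name QuotedRegexDemo and the helper groups are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedRegexDemo {
    // The regex '(.*?)(?<!\\)' as a Java literal: each regex backslash is doubled.
    static final Pattern QUOTED = Pattern.compile("'(.*?)(?<!\\\\)'");

    /** Collects group(1) of every match, i.e. the text between unescaped quotes. */
    public static List<String> groups(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = QUOTED.matcher(line);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // "\\'" in Java source is the two characters \' in the actual string
        System.out.println(groups("'establish from'('person \\'s id', 'the other')."));
        // prints [establish from, person \'s id, the other]
    }
}
```

The lookbehind only inspects the one character before the closing quote, so this assumes the data never contains an escaped backslash immediately before a closing quote.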
You can use the regex with Pattern and Matcher like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // "\\'" in Java source yields the two characters \' that appear in the data file
        String[] stringArr = { "'be in'('force', 'the closed area').",
                "'advise'('coxswains', 'mr mak').",
                "'be'('a good', 'restricted area').",
                "'establish from'('person \\'s id', 'the other')." };
        // Lazily match up to the first quote that is not preceded by a backslash
        Pattern p = Pattern.compile("'(.*?)(?<!\\\\)'");
        String[][] arr = new String[stringArr.length][3];
        for (int i = 0; i < stringArr.length; i++) {
            Matcher m = p.matcher(stringArr[i]);
            int j = 0;
            // Each line holds three quoted fields; stop after the third
            while (j < arr[i].length && m.find()) {
                arr[i][j++] = m.group(1);
            }
        }
        for (int k = 0; k < arr.length; k++) {
            for (int j = 0; j < arr[k].length; j++) {
                System.out.println("arr[" + k + "][" + j + "] " + arr[k][j]);
            }
        }
    }
}
Output:
arr[0][0] be in
arr[0][1] force
arr[0][2] the closed area
arr[1][0] advise
arr[1][1] coxswains
arr[1][2] mr mak
arr[2][0] be
arr[2][1] a good
arr[2][2] restricted area
arr[3][0] establish from
arr[3][1] person \'s id
arr[3][2] the other
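Since the question reads from a large file rather than a hard-coded array, the same idea could be applied line by line. What follows is a sketch under stated assumptions, not a definitive implementation: HornFileLoader and load are hypothetical names, the file path is expected as a command-line argument, every useful line is assumed to hold exactly three quoted fields, and lines with fewer are skipped. The pattern is compiled once and rows are collected in a list, so the number of lines does not have to be known in advance.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HornFileLoader {
    // Compiled once and reused for every line
    private static final Pattern QUOTED = Pattern.compile("'(.*?)(?<!\\\\)'");

    /** Reads the file line by line and returns one row of three fields per line. */
    public static String[][] load(Path file) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = QUOTED.matcher(line);
                String[] row = new String[3];
                int j = 0;
                while (j < row.length && m.find()) {
                    row[j++] = m.group(1);
                }
                if (j == row.length) { // skip lines that do not have three fields
                    rows.add(row);
                }
            }
        }
        return rows.toArray(new String[0][]);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HornFileLoader <file>");
            return;
        }
        String[][] data = load(Paths.get(args[0]));
        System.out.println(data.length + " rows loaded");
    }
}
```

Note that the rows are indexed as data[line][field], which matches the [0][0], [0][1], [0][2] layout described in the question.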