PHP multiple substr on string
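I have a string, and I am given a set of field indexes (column widths) for it.
I'm building a process to read it, and I'm wondering whether there is a PHP function I've overlooked or don't know about that would make this easier.
$data: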
Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....................................
FOCF219611 CUSTOMER -0.02 8050 TOOLS & SUPPLIES - SERVICE
FOCF219669 CUSTOMER -14.49 8050 TOOLS & SUPPLIES - SERVICE
$fieldIndexes:
Array (
[0] => 15
[1] => 20
[2] => 12
[3] => 10
[4] => 50
)
Splitting $data into the $headers array:
array_push($headers, substr($data, 0, $fieldIndexes[0]));
array_push($headers, substr($data, $fieldIndexes[0], $fieldIndexes[1]));
array_push($headers, substr($data, $fieldIndexes[1], $fieldIndexes[2]));
array_push($headers, substr($data, $fieldIndexes[2], $fieldIndexes[3]));
array_push($headers, substr($data, $fieldIndexes[3], $fieldIndexes[4]));
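Is there a function that removes part of a string, something like array_shift but for strings?
I was thinking I could loop over $fieldIndexes, pull the first length off the start of the string, and so on until the string is empty. That would condense this to about 3 lines and make it work with any number of fieldIndexes.
Desired result: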
Array
(
[HEADERS] => Array
(
[0] => Invoice No
[1] => Sale Type Desc
[2] => Misc Amt
[3] => Misc Acc
[4] => Misc Acc Desc
)
[1] => Array
(
[Invoice No] => FOCF219611
[Sale Type Desc] => CUSTOMER
[Misc Amt] => -0.02
[Misc Acc] => 8050
[Misc Acc Desc] => TOOLS & SUPPLIES - SERVICE
)
)
Like this (since I mentioned it in the comments):
$str = 'Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....................................';
$f = fopen('php://temp', 'w+');
fwrite($f, $str);
rewind($f);
$headers = [];
$header = '';
while(false !== ($c = fgetc($f))){
    if($c != '.'){
        $header .= $c;
    }elseif(!empty($header)){
        $headers[] = trim($header);
        $header = '';
    }
}
print_r($headers);
Output:
Array
(
[0] => Invoice No
[1] => Sale Type Desc
[2] => Misc Amt
[3] => Misc Acc
[4] => Misc Acc Desc
)
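Note that I did this without using the offsets, even though I mentioned them in the comments. I like doing odd things like this; it's fun.
Of course, you can also get the same result this way: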
$str = 'Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....................................';
print_r(array_filter(array_map('trim',explode('.', $str))));
But that's way too easy.
If you don't like the wonky keys (array_filter preserves the original ones), you can slap an array_values on it:
print_r(array_values(array_filter(array_map('trim',explode('.', $str)))));
Ha ha, just another Monday.
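For what it's worth, the explode / trim / filter chain above can probably be collapsed into a single preg_split call. This is only a sketch, and it assumes that no header label contains a dot of its own; the pattern and flag are standard PCRE/PHP, but the choice of pattern is mine, not the original answer's.

```php
<?php
$str = 'Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....................................';

// Split on each run of dots plus any whitespace that follows it.
// PREG_SPLIT_NO_EMPTY drops the empty piece left after the final run of dots,
// so the result is already sequentially keyed (no array_values needed).
$headers = preg_split('/\.+\s*/', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($headers);
```

A label such as "Acct. No" would be cut in two by this pattern, so the fixed-width approach is safer if that can happen.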
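Update
You can also use a stream wrapper to fix up the file so it can be read as CSV. In PHP 5.4 (or 5.3, I think) SplFileObj was missing fgetcsv, and I used a trick like this to patch the class... :)
Here is what I came up with (though there is a lot I don't know about your data):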
$str = 'Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....................................
somedata .... someother stuff ... foobar ... hello ... world..
';
//pretend this is a real file
$f = fopen('php://temp', 'w+');
fwrite($f, $str);
rewind($f);
$headers = [];
$num_headers = 0;
$i = 1;
while(false !== ($c = fgetcsv($f))){
    //fgetcsv returns [null] for a blank line, skip those
    if($c === [null]) continue;
    //if there is only one element assume the delimiter is wrong
    if(count($c) == 1){
        //you could test the string for multiple delimiters and change
        /*
        if(strpos($c[0], '.') !== false){
            $regex = '/\.+/';
        }elseif(strpos($c[0], '~') !== false){
            $regex = '/~+/';
        } etc....
        */
        //use memory buffer to fix files with .'s but still read them as
        //a normal CSV file, php://memory is really fast.
        //and this gives us all the parsing benefits of fgetcsv
        //you could use any delimiter here you want.
        $fixed = trim(preg_replace('/\.+/', ',', $c[0]),',');
        $m = fopen('php://memory', 'w+');
        fwrite($m, $fixed);
        rewind($m);
        $c = fgetcsv($m);
        fclose($m); //release the temporary buffer
    }
    //trim any spaces, not a bad idea anyway
    $c = array_map('trim', $c);
    //if no headers use the first line of file as the header
    if(empty($headers)){
        $headers = $c;
        //count them (see below)
        $num_headers = count($headers);
        continue;
    }
    //array_combine is a good choice for header => values
    //but the arrays have to be the same size
    if(count($c) != $num_headers) die("missing delimiter on line {$i}");
    $line = array_combine($headers, $c);
    //continue with normal csv operation
    print_r($line);
    ++$i; //track the data line number (the header line is not counted)
}
Output:
Array
(
[Invoice No] => somedata
[Sale Type Desc] => someother stuff
[Misc Amt] => foobar
[Misc Acc] => hello
[Misc Acc Desc] => world
)
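If the only reason for the php://memory round trip is to run the repaired line through the CSV parser, str_getcsv may do the same job on the string directly. A minimal sketch, assuming runs of dots are the only delimiter and the values contain no commas; splitDottedLine is a hypothetical helper name, not something from the answer above.

```php
<?php
// Hypothetical helper: turn one dot-delimited line into trimmed fields.
function splitDottedLine(string $line): array
{
    // Collapse each run of dots into a comma, then drop leading/trailing
    // commas and whitespace so no empty first or last field is produced.
    $fixed = trim(preg_replace('/\.+/', ',', $line), ", \t\r\n");

    // str_getcsv parses a CSV string without needing a file handle.
    // The delimiter, enclosure and escape arguments are passed explicitly.
    return array_map('trim', str_getcsv($fixed, ',', '"', '\\'));
}

$headers = splitDottedLine('Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....');
$values  = splitDottedLine('somedata .... someother stuff ... foobar ... hello ... world..');

print_r(array_combine($headers, $values));
```

The stream version still has the edge when the input really is a large file, since it reads one line at a time.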
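Update
As I mentioned in the comments (after finding out the source is HTML), you could use a DOM parser. One I've used in the past is PHPQuery. It's a bit dated now, but it's nice because you can use jQuery syntax. For example, say you have this: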
<ul id="title" >
<li>header</li>
<li>header</li>
<li>header</li>
</ul>
You can find it with something like this (it's been a while, so apologies if this is wrong):
$length = $PHPQuery->find("#title li")->length;
for($i=0;$i<$length;++$i){
    echo $PHPQuery->find("#title li:eq($i)")->text();
}
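You can even pull out attributes, for example with ->attr('href'). Basically, you can take advantage of the HTML structure and extract what you need, rather than converting it to text and trying to strip out a bunch of "stuff".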
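Since PHPQuery is dated, the same idea can be sketched with PHP's bundled DOM extension instead. This is a swapped-in technique, not the PHPQuery API, and it assumes the dom extension is enabled; the markup is the sample list from above.

```php
<?php
$html = '<ul id="title"><li>header</li><li>header</li><li>header</li></ul>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// XPath plays the role of the jQuery selector "#title li".
$xpath = new DOMXPath($doc);
$headers = [];
foreach ($xpath->query('//ul[@id="title"]/li') as $li) {
    $headers[] = trim($li->textContent);
}

print_r($headers);
```

Attributes can be read in the same loop with $li->getAttribute('href') when the node is an element that has one.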
Cheers!
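You can create a function like this to split the string using chunk sizes.
Note: because each size in the $fieldIndexes array does not include the space between columns, I added one to each length (15+1, 20+1, ...).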
<?php
$headerString ="Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....................................";
$fieldIndexes = [ 15+1, 20+1, 12+1, 10+1, 50+1];
function getParts($string, $positions){
    $parts = array();
    foreach ($positions as $position){
        $parts[] = substr($string, 0, $position);
        $string = substr($string, $position);
    }
    return $parts;
}
print_r(getParts($headerString, $fieldIndexes));
?>
Result:
Array
(
[0] => Invoice No.....
[1] => Sale Type Desc......
[2] => Misc Amt....
[3] => Misc Acc..
[4] => Misc Acc Desc.....................................
)
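To get from these chunks to the structure the question asks for, the same width-based idea can be extended with a running offset, padding removal, and array_combine. The following is a rough sketch under a few assumptions: columns are separated by exactly one space, data rows are space-padded to the same widths as the header (the sample rows in the question appear with their padding collapsed, so the rows here are rebuilt with sprintf), and splitFixedWidth is a hypothetical helper name.

```php
<?php
// Hypothetical helper: cut one fixed-width line into trimmed fields.
// $widths  - column widths, excluding the separator
// $padding - characters to strip from both ends of each field
// $gap     - assumed number of separator characters between columns
function splitFixedWidth(string $line, array $widths, string $padding = ' ', int $gap = 1): array
{
    $fields = [];
    $offset = 0;
    foreach ($widths as $width) {
        // The cast covers older PHP versions where substr() can return false.
        $fields[] = trim((string) substr($line, $offset, $width), $padding);
        $offset += $width + $gap;
    }
    return $fields;
}

$fieldIndexes = [15, 20, 12, 10, 50];
$headerLine = 'Invoice No..... Sale Type Desc...... Misc Amt.... Misc Acc.. Misc Acc Desc.....................................';

// Assumed layout of the data rows: padded to the header's column widths.
$rowFormat = '%-15s %-20s %12s %-10s %-50s';
$dataLines = [
    sprintf($rowFormat, 'FOCF219611', 'CUSTOMER', '-0.02', '8050', 'TOOLS & SUPPLIES - SERVICE'),
    sprintf($rowFormat, 'FOCF219669', 'CUSTOMER', '-14.49', '8050', 'TOOLS & SUPPLIES - SERVICE'),
];

// Header cells are padded with dots, data cells with spaces.
$headers = splitFixedWidth($headerLine, $fieldIndexes, ' .');
$result = ['HEADERS' => $headers];
foreach ($dataLines as $n => $line) {
    $result[$n + 1] = array_combine($headers, splitFixedWidth($line, $fieldIndexes));
}

print_r($result);
```

Dots are stripped only from the header, so a data value that legitimately ends in a dot is left alone.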