使用 Lists.partition 或 Iterable.partition 将集合拆分为子集
Split a set to sub sets using Lists.partition or Iterable.partition
我想知道将集合拆分为子集的有效方法是什么?
Iterable<List<Long>> partitions = Iterables.partition(numbers, 10);
或
List<List<Long>> partitions = Lists.partition(numbers, 10);
时间复杂度有什么不同?
谢谢
我们来看看内部实现
method partition() for iterables
529 public static <T> Iterable<List<T>> partition(final Iterable<T> iterable, final int size) {
530 checkNotNull(iterable);
531 checkArgument(size > 0);
532 return new FluentIterable<List<T>>() {
533 @Override
534 public Iterator<List<T>> iterator() {
535 return Iterators.partition(iterable.iterator(), size);
536 }
537 };
538 }
681 public static <T> List<List<T>> partition(List<T> list, int size) {
682 checkNotNull(list);
683 checkArgument(size > 0);
684 return (list instanceof RandomAccess)
685 ? new RandomAccessPartition<T>(list, size)
686 : new Partition<T>(list, size);
687 }
在上面的2个实现中,如果你分析代码,Lists
检查list instanceof RandomAccess
,如果是真的,然后使用RandomAccess接口评估分区。
现在如果我们看到 RandomAccess 的 documentation:
public interface RandomAccess
Marker interface used by List implementations to indicate that they support fast (generally constant
time) random access. The primary purpose of this interface is to allow
generic algorithms to alter their behavior to provide good performance
when applied to either random or sequential access lists.
所以,我猜后者的性能更好。
如果我们的 list
不是 RandomAccess
接口的实例怎么办?
Lists
使用的内核class进行分区(source)
689 private static class Partition<T> extends AbstractList<List<T>> {
690 final List<T> list;
691 final int size;
692
693 Partition(List<T> list, int size) {
694 this.list = list;
695 this.size = size;
696 }
697
698 @Override
699 public List<T> get(int index) {
700 checkElementIndex(index, size());
701 int start = index * size;
702 int end = Math.min(start + size, list.size());
703 return list.subList(start, end);
704 }
705
706 @Override
707 public int size() {
708 return IntMath.divide(list.size(), size, RoundingMode.CEILING);
709 }
710
711 @Override
712 public boolean isEmpty() {
713 return list.isEmpty();
714 }
715 }
和Iterators
使用的核心class执行分区(source)
605 private static <T> UnmodifiableIterator<List<T>> partitionImpl(
606 final Iterator<T> iterator, final int size, final boolean pad) {
607 checkNotNull(iterator);
608 checkArgument(size > 0);
609 return new UnmodifiableIterator<List<T>>() {
610 @Override
611 public boolean hasNext() {
612 return iterator.hasNext();
613 }
614 @Override
615 public List<T> next() {
616 if (!hasNext()) {
617 throw new NoSuchElementException();
618 }
619 Object[] array = new Object[size];
620 int count = 0;
621 for (; count < size && iterator.hasNext(); count++) {
622 array[count] = iterator.next();
623 }
624 for (int i = count; i < size; i++) {
625 array[i] = null; // for GWT
626 }
627
628 @SuppressWarnings("unchecked") // we only put Ts in it
629 List<T> list = Collections.unmodifiableList(
630 (List<T>) Arrays.asList(array));
631 return (pad || count == size) ? list : list.subList(0, count);
632 }
633 };
634 }
现在分析最后2个代码,如果不综合分析,真的很难有人知道哪个更好。但是,我们可以说,如果我们的列表是 RandomAccess
接口的实例,那么 Lists.partition(lists, pivot);
肯定更快。
我想知道将集合拆分为子集的有效方法是什么?
Iterable<List<Long>> partitions = Iterables.partition(numbers, 10);
或
List<List<Long>> partitions = Lists.partition(numbers, 10);
时间复杂度有什么不同?
谢谢
我们来看看内部实现
method partition() for iterables
529 public static <T> Iterable<List<T>> partition(final Iterable<T> iterable, final int size) {
530 checkNotNull(iterable);
531 checkArgument(size > 0);
532 return new FluentIterable<List<T>>() {
533 @Override
534 public Iterator<List<T>> iterator() {
535 return Iterators.partition(iterable.iterator(), size);
536 }
537 };
538 }
681 public static <T> List<List<T>> partition(List<T> list, int size) {
682 checkNotNull(list);
683 checkArgument(size > 0);
684 return (list instanceof RandomAccess)
685 ? new RandomAccessPartition<T>(list, size)
686 : new Partition<T>(list, size);
687 }
在上面的2个实现中,如果你分析代码,Lists
检查list instanceof RandomAccess
,如果是真的,然后使用RandomAccess接口评估分区。
现在如果我们看到 RandomAccess 的 documentation:
public interface RandomAccess
Marker interface used by List implementations to indicate that they support fast (generally constant time) random access. The primary purpose of this interface is to allow generic algorithms to alter their behavior to provide good performance when applied to either random or sequential access lists.
所以,我猜后者的性能更好。
如果我们的 list
不是 RandomAccess
接口的实例怎么办?
Lists
使用的内核class进行分区(source)
689 private static class Partition<T> extends AbstractList<List<T>> {
690 final List<T> list;
691 final int size;
692
693 Partition(List<T> list, int size) {
694 this.list = list;
695 this.size = size;
696 }
697
698 @Override
699 public List<T> get(int index) {
700 checkElementIndex(index, size());
701 int start = index * size;
702 int end = Math.min(start + size, list.size());
703 return list.subList(start, end);
704 }
705
706 @Override
707 public int size() {
708 return IntMath.divide(list.size(), size, RoundingMode.CEILING);
709 }
710
711 @Override
712 public boolean isEmpty() {
713 return list.isEmpty();
714 }
715 }
和Iterators
使用的核心class执行分区(source)
605 private static <T> UnmodifiableIterator<List<T>> partitionImpl(
606 final Iterator<T> iterator, final int size, final boolean pad) {
607 checkNotNull(iterator);
608 checkArgument(size > 0);
609 return new UnmodifiableIterator<List<T>>() {
610 @Override
611 public boolean hasNext() {
612 return iterator.hasNext();
613 }
614 @Override
615 public List<T> next() {
616 if (!hasNext()) {
617 throw new NoSuchElementException();
618 }
619 Object[] array = new Object[size];
620 int count = 0;
621 for (; count < size && iterator.hasNext(); count++) {
622 array[count] = iterator.next();
623 }
624 for (int i = count; i < size; i++) {
625 array[i] = null; // for GWT
626 }
627
628 @SuppressWarnings("unchecked") // we only put Ts in it
629 List<T> list = Collections.unmodifiableList(
630 (List<T>) Arrays.asList(array));
631 return (pad || count == size) ? list : list.subList(0, count);
632 }
633 };
634 }
现在分析最后2个代码,如果不综合分析,真的很难有人知道哪个更好。但是,我们可以说,如果我们的列表是 RandomAccess
接口的实例,那么 Lists.partition(lists, pivot);
肯定更快。